THERAPEUTIC COMPOSITIONS

RELATED APPLICATIONS 
The present application claims the benefit of Application No. 60/452,682, filed
March 7, 2003; Application No. 60/462,894, filed April 14, 2003; Application
No. 60/465,665, filed April 25, 2003; Application No. 60/463,772, filed April 17, 2003;
Application No. 60/465,802, filed April 25, 2003; Application No. 60/493,986, filed
August 8, 2003; Application No. 60/494,597, filed August 11, 2003; Application No.
60/506,341, filed September 26, 2003; Application No. 60/518,453, filed November 7, 2003;
Application No. 60/454,265, filed March 12, 2003; Application No. 60/454,962, filed March
13, 2003; Application No. 60/455,050, filed March 13, 2003; Application No. 60/469,612,
filed May 9, 2003; Application No. 60/510,246, filed October 9, 2003; and Application
No. 60/510,318, filed October 10, 2003. The contents of these provisional applications are
hereby incorporated by reference in their entirety.

TECHNICAL FIELD 

The invention relates to RNAi and related methods, e.g., methods of making and 
using iRNA agents. 

BACKGROUND 

RNA interference or "RNAi" is a term initially coined by Fire and co-workers to
describe the observation that double-stranded RNA (dsRNA) can block gene expression
when it is introduced into worms (Fire et al. (1998) Nature 391, 806-811). Short dsRNA
directs gene-specific, post-transcriptional silencing in many organisms, including vertebrates,
and has provided a new tool for studying gene function. RNAi may involve mRNA
degradation.



WO 2004/080406



PCT/US2004/007070 



SUMMARY 

A number of advances related to the application of RNAi to the treatment of subjects 
are disclosed herein. For example, the invention features iRNA agents targeted to specific 
genes; palindromic iRNA agents; iRNA agents having non-canonical monomer pairings;
iRNA agents having particular structures or architectures, e.g., the Z-X-Y or asymmetrical
iRNA agents described herein; drug delivery conjugates for the delivery of iRNA agents;
amphipathic substances for the delivery of iRNA agents; as well as iRNA agents having
chemical modifications for optimizing a property of the iRNA agent. The invention features
each of these advances broadly as well as in combinations. For example, an iRNA agent
targeted to a specific gene can also include one or more of a palindromic, non-canonical,
Z-X-Y, or asymmetric structure. Other nonlimiting examples of combinations include an
asymmetric structure combined with a chemical modification, or formulations or methods or
routes of delivery combined with, e.g., chemical modifications or architectures described
herein. The iRNA agents of the invention can include any one of these advances, or pairwise
or higher order combinations of the separate advances.

In one aspect, the invention features iRNA agents that can target more than one RNA 
region, and methods of using and making the iRNA agents. 

In another aspect, an iRNA agent includes a first and second sequence that are
sufficiently complementary to each other to hybridize. The first sequence can be
complementary to a first target RNA region and the second sequence can be complementary 
to a second target RNA region. 

In one embodiment, the first and second sequences of the iRNA agent are on different 
RNA strands, and the mismatch between the first and second sequences is less than 50%,
40%, 30%, 20%, 10%, 5%, or 1%.
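
The mismatch percentage between two strands can be illustrated with a short calculation. The following Python sketch is an illustration under stated assumptions, not a method given in this application: it assumes two RNA strands of equal length that pair antiparallel over their full length with no overhangs, and it counts any position that is not a Watson-Crick pair (A-U or G-C) as a mismatch. The function name `mismatch_fraction` and the example sequences are hypothetical.

```python
# Illustrative sketch only; the function name and sequences are hypothetical.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def mismatch_fraction(first: str, second: str) -> float:
    """Fraction of positions that do not form a Watson-Crick pair.

    Both strands are written 5'->3'. They are assumed to be the same
    length and to pair antiparallel over their full length.
    """
    first, second = first.upper(), second.upper()
    if len(first) != len(second) or not first:
        raise ValueError("strands must be non-empty and of equal length")
    # Base i of the first strand faces base (n - 1 - i) of the second strand.
    pairs = zip(first, reversed(second))
    mismatches = sum((a, b) not in WATSON_CRICK for a, b in pairs)
    return mismatches / len(first)


first_seq = "GCAUGCAUGC"
second_seq = "GCAUGCAUGA"  # one base changed relative to a perfect partner
fraction = mismatch_fraction(first_seq, second_seq)
print(f"{fraction:.0%} mismatch")
```

A result below a chosen threshold (for example 0.10 for "less than 10%") would place a pair of strands within the corresponding range recited above. Treating G-U wobble pairs as matches would require extending `WATSON_CRICK`.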

In another embodiment, the first and second sequences of the iRNA agent are on the 
same RNA strand, and in a related embodiment more than 50%, 60%, 70%, 80%, 90%, 95%,
or 1% of the iRNA agent is in bimolecular form. 

In another embodiment, the first and second sequences of the iRNA agent are fully 
complementary to each other. 

In one embodiment, the first target RNA region is encoded by a first gene and the
second target RNA region is encoded by a second gene, and in another embodiment, the first
and second target RNA regions are different regions of an RNA from a single gene. In
another embodiment, the first and second sequences differ by at least 1 and no more than 6
nucleotides.

In certain embodiments, the first and second target RNA regions are on transcripts
encoded by first and second sequence variants, e.g., first and second alleles, of a gene. The
sequence variants can be mutations, or polymorphisms, for example.

In certain embodiments, the first target RNA region includes a nucleotide
substitution, insertion, or deletion relative to the second target RNA region. 

In other embodiments, the second target RNA region is a mutant or variant of the first
target RNA region.

In certain embodiments, the first and second target RNA regions comprise viral, e.g.,
HCV, or human RNA regions. The first and second target RNA regions can also be on
variant transcripts of an oncogene or include different mutations of a tumor suppressor gene
transcript. In one embodiment, the oncogene or tumor suppressor gene is expressed in the
liver. In addition, the first and second target RNA regions correspond to hot-spots for
genetic variation.

In another aspect, the invention features a mixture of varied iRNA agent molecules, 
including one iRNA agent that includes a first sequence and a second sequence sufficiently 
complementary to each other to hybridize, and where the first sequence is complementary to
a first target RNA region and the second sequence is complementary to a second target RNA
region. The mixture also includes at least one additional iRNA agent variety that includes a
third sequence and a fourth sequence sufficiently complementary to each other to hybridize,
and where the third sequence is complementary to a third target RNA region and the fourth
sequence is complementary to a fourth target RNA region. In addition, the first or second
sequence is sufficiently complementary to the third or fourth sequence to be capable of
hybridizing to each other. In one embodiment, at least one, two, three or all four of the target
RNA regions are expressed in the liver. Exemplary RNAs are transcribed from the apoB-100 
gene, glucose-6-phosphatase gene, beta catenin gene, or an HCV gene. 

In certain embodiments, the first and second sequences are on the same or different
RNA strands, and the third and fourth sequences are on the same or different RNA strands.

In one embodiment, the mixture further includes a third iRNA agent that is composed
of the first or second sequence and the third or fourth sequence.

In one embodiment, the first sequence is identical to at least one of the second, third
and fourth sequences, and in another embodiment, the first region differs by at least 1 but no
more than 6 nucleotides from at least one of the second, third and fourth regions.

In certain embodiments, the first target RNA region comprises a nucleotide
substitution, insertion, or deletion relative to the second, third or fourth target RNA region.

The target RNA regions can be variant sequences of a viral or human RNA, and in 
certain embodiments, at least two of the target RNA regions can be on variant transcripts of
an oncogene or tumor suppressor gene. In one embodiment, the oncogene or tumor 
suppressor gene is expressed in the liver. 

In certain embodiments, at least two of the target RNA regions correspond to hot-spots
for genetic variation.

In one embodiment, the iRNA agents of the invention are formulated for 
pharmaceutical use. In one aspect, the invention provides a container (e.g., a vial, syringe,
nebulizer, etc.) to hold the iRNA agents described herein.

Another aspect of the invention features a method of making an iRNA agent. The
method includes constructing an iRNA agent that has a first sequence complementary to a 
first target RNA region, and a second sequence complementary to a second target RNA 
region. The first and second target RNA regions have been identified as being sufficiently 
complementary to each other to be capable of hybridizing. In one embodiment, the first and 
second target RNA regions are on transcripts expressed in the liver. 
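
Identifying target regions that are complementary to each other can be sketched as a search for a window in one RNA whose reverse complement occurs in another RNA. The Python sketch below is a minimal illustration under stated assumptions: exact Watson-Crick complementarity, a fixed window length, and hypothetical names (`reverse_complement`, `complementary_region_pairs`) and toy sequences. It is not presented as the procedure used in this application.

```python
from typing import Dict, Iterator, List, Tuple

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def complementary_region_pairs(
    rna_a: str, rna_b: str, length: int
) -> Iterator[Tuple[int, int, str]]:
    """Yield (start_a, start_b, region_a) for each window of rna_a whose
    reverse complement appears as a window of rna_b (0-based starts)."""
    rna_a, rna_b = rna_a.upper(), rna_b.upper()
    # Index every window of rna_b by its sequence.
    windows_b: Dict[str, List[int]] = {}
    for j in range(len(rna_b) - length + 1):
        windows_b.setdefault(rna_b[j:j + length], []).append(j)
    # Look up the reverse complement of every window of rna_a.
    for i in range(len(rna_a) - length + 1):
        region = rna_a[i:i + length]
        for j in windows_b.get(reverse_complement(region), []):
            yield i, j, region


# Toy sequences (hypothetical, not taken from any real transcript).
target_1 = "AAGGCUAGCUAACC"
target_2 = "CCUUAGCUAGCCGG"
pairs = list(complementary_region_pairs(target_1, target_2, 8))
for start_1, start_2, region in pairs:
    print(start_1, start_2, region)
```

Passing the same RNA as both arguments searches for two mutually complementary regions within a single transcript. Allowing a limited number of mismatches would require replacing the exact dictionary lookup with a pairwise comparison.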

In certain embodiments, the first and second target RNA regions can correspond to 
two different regions encoded by one gene, or to regions encoded by two different genes. 

Another aspect of the invention features a method of making an iRNA agent 
composition. The method includes obtaining or providing information about a region of an 
RNA of a target gene (e.g., a viral or human gene, or an oncogene or tumor suppressor, e.g., 
p53), where the region has high variability or mutational frequency (e.g., in humans). In
addition, information about a plurality of RNA targets within the region is obtained or
provided, where each RNA target corresponds to a different variant or mutant of the gene
(e.g., a region including the codon encoding p53 248Q and/or p53 249S). The iRNA agent is
constructed such that a first sequence is complementary to a first of the plurality of variant
RNA targets (e.g., encoding 249Q) and a second sequence is complementary to a second of
the plurality of variant RNA targets (e.g., encoding 249S). The first and second sequences
are sufficiently complementary to hybridize. In certain embodiments, the target gene can be
a viral or human gene expressed in the liver. 

In one embodiment, sequence analysis, e.g., to identify common mutants in the target
gene, is used to identify a region of the target gene that has high variability or mutational
frequency. For example, sequence analysis can be used to identify regions of apoB-100 or
beta catenin that have high variability or mutational frequency. In another embodiment, the
region of the target gene having high variability or mutational frequency is identified by 
obtaining or providing genotype information about the target gene from a population. In 
another embodiment, the genotype information can be from a population suffering from a 
liver disorder, such as hepatocellular carcinoma or hepatoblastoma. 
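
One simple way to picture the sequence analysis described above is to score each position of a set of aligned sequences by how often it departs from the most common base, and then to look for the window with the highest total score. The Python sketch below is only an illustration: it assumes the sequences are already aligned and of equal length, the scoring rule is a choice made here rather than one stated in this application, and the names `variant_frequency` and `most_variable_window` and the toy data are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def variant_frequency(aligned: Sequence[str]) -> List[float]:
    """Per-position fraction of sequences that differ from the most common base."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and pre-aligned to equal length")
    frequencies = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        frequencies.append(1 - most_common_count / len(column))
    return frequencies


def most_variable_window(aligned: Sequence[str], width: int) -> Tuple[int, float]:
    """Return (start, score) of the first window with the highest summed
    variant frequency; start is a 0-based position in the alignment."""
    freqs = variant_frequency(aligned)
    scores = [sum(freqs[i:i + width]) for i in range(len(freqs) - width + 1)]
    best_start = max(range(len(scores)), key=scores.__getitem__)
    return best_start, scores[best_start]


# Hypothetical aligned sequences from four individuals (toy data).
population = [
    "AUGGCCAUUGAA",
    "AUGGCCGUUGAA",
    "AUGGCCAUCGAA",
    "AUGGCCGUCGAA",
]
start, score = most_variable_window(population, width=4)
print(start, score)
```

With genotype information from a real population, the input would be one aligned sequence per sampled allele, and the window width would be chosen to match the intended length of the iRNA agent sequence.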

Another aspect of the invention features a method of modulating expression, e.g., 
downregulating or silencing, a target gene, by providing an iRNA agent that has a first
sequence and a second sequence sufficiently complementary to each other to hybridize. In 
addition, the first sequence is complementary to a first target RNA region and the second 
sequence is complementary to a second target RNA region. 

In one embodiment, the iRNA agent is administered to a subject, e.g., a human. 

In another embodiment, the first and second sequences are between 15 and 30
nucleotides in length.

In one embodiment, the method of modulating expression of the target gene further
includes providing a second iRNA agent that has a third sequence complementary to a third
target RNA region. The third sequence can be sufficiently complementary to the first or
second sequence to be capable of hybridizing to either the first or second sequence.

Another aspect of the invention features a method of modulating expression, e.g., 
downregulating or silencing, a plurality of target RNAs, each of the plurality of target RNAs
corresponding to a different target gene. The method includes providing an iRNA agent
selected by identifying a first region in a first target RNA of the plurality and a second region
in a second target RNA of the plurality, where the first and second regions are sufficiently
complementary to each other to be capable of hybridizing.

In another aspect of the invention, an iRNA agent molecule includes a first sequence
complementary to a first variant RNA target region and a second sequence complementary to
a second variant RNA target region, and the first and second variant RNA target regions
correspond to first and second variants or mutants of a target gene. In certain embodiments,
the target gene is an apoB-100, beta catenin, or glucose-6-phosphatase gene.

In one embodiment, the target gene is a viral gene (e.g., an HCV gene), tumor
suppressor or oncogene.

In another embodiment, the first and second variant target RNA regions include
allelic variants of the target gene.

In another embodiment, the first and second variant RNA target regions comprise
mutations (e.g., point mutations) or polymorphisms of the target gene.

In one embodiment, the first and second variant RNA target regions correspond to
hot-spots for genetic variation.

Another aspect of the invention features a plurality (e.g., a panel or bank) of iRNA 
agents. Each of the iRNA agents of the plurality includes a first sequence complementary to 
a first variant target RNA region and a second sequence complementary to a second variant 
target RNA region, where the first and second variant target RNA regions correspond to first 
and second variants of a target gene. In certain embodiments, the variants are allelic variants 
of the target gene. 

Another aspect of the invention provides a method of identifying an iRNA agent for 
treating a subject. The method includes providing or obtaining information, e.g., a genotype,
about a target gene, providing or obtaining information about a plurality (e.g., panel or bank) 
of iRNA agents, comparing the information about the target gene to information about the 
plurality of iRNA agents, and selecting one or more of the plurality of iRNA agents for 
treating the subject. Each of the plurality of iRNA agents includes a first sequence 
complementary to a first variant target RNA region and a second sequence complementary to 
a second variant target RNA region, and the first and second variant target RNA regions 
correspond to first and second variants of the target gene. The target gene can be an 
endogenous gene of the subject or a viral gene. The information about the plurality of iRNA
agents can be the sequence of the first or second sequence of one or more of the plurality. 
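
The comparing and selecting steps can be pictured as a lookup of each panel member against the subject's genotype. The Python sketch below is a simplified illustration under stated assumptions: the genotype is supplied as RNA allele sequences, an agent is selected when at least one of its two sequences is fully complementary to a region present in a subject allele, and the check that an agent's two sequences can hybridize to each other is omitted. The class `PanelAgent`, the function `select_agents`, and all sequences are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


@dataclass(frozen=True)
class PanelAgent:
    name: str
    first_sequence: str   # 5'->3', complementary to a first variant region
    second_sequence: str  # 5'->3', complementary to a second variant region


def select_agents(panel: Iterable[PanelAgent], subject_alleles: Iterable[str]) -> List[str]:
    """Return names of agents with at least one sequence fully complementary
    to a region found in any of the subject's allele sequences."""
    alleles = [allele.upper() for allele in subject_alleles]
    selected = []
    for agent in panel:
        # The region an agent sequence targets is its reverse complement.
        targets = (reverse_complement(agent.first_sequence),
                   reverse_complement(agent.second_sequence))
        if any(target in allele for target in targets for allele in alleles):
            selected.append(agent.name)
    return selected


# Hypothetical panel: each agent is built against two toy variant regions.
panel = [
    PanelAgent("agent-1", reverse_complement("UCCAGGC"), reverse_complement("UCCUGGC")),
    PanelAgent("agent-2", reverse_complement("UCCGGGC"), reverse_complement("UCCCGGC")),
]
subject = ["GGAUCCUGGCAU"]  # toy allele carrying the second variant of agent-1
chosen = select_agents(panel, subject)
print(chosen)
```

In this toy case only the agent whose second sequence matches the subject's variant is returned. A practical selection would also need to tolerate partial complementarity and to rank several matching agents.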



In certain embodiments, at least one of the selected iRNA agents includes a sequence
capable of hybridizing to an RNA region corresponding to the target gene, and at least one of 
the selected iRNA agents comprises a sequence capable of hybridizing to an RNA region 
corresponding to a variant or mutant of the target gene. 

In one aspect, the invention relates to compositions and methods for silencing genes 
expressed in the liver, e.g., to treat disorders of or related to the liver. An iRNA agent
composition of the invention can be one which has been modified to alter distribution in 
favor of the liver. 

In another aspect, the invention relates to iRNA agents that can target more than one 
RNA region, and methods of using and making the iRNA agents. In one embodiment, the 
RNA is from a gene that is active in the liver, e.g., apoB-100, glucose-6-phosphatase, beta- 
catenin, or Hepatitis C virus (HCV). 

In another aspect, an iRNA agent includes a first and second sequence that are 
sufficiently complementary to each other to hybridize. The first sequence can be
complementary to a first target RNA region and the second sequence can be complementary
to a second target RNA region. For example, the first sequence can be complementary to a
first target apoB-100 RNA region and the second sequence can be complementary to a
second target apoB-100 RNA region.

In one embodiment, the first target RNA region is encoded by a first gene, e.g., a
gene expressed in the liver, and the second target RNA region is encoded by a second gene, 
e.g., a second gene expressed in the liver. In another embodiment, the first and second target 
RNA regions are different regions of an RNA from a single gene, e.g., a single gene that is at 
least expressed in the liver. In another embodiment, the first and second sequences differ by 
at least one and no more than six nucleotides. 

In another embodiment, sequence analysis, e.g., to identify common mutants in the
target gene, is used to identify a region of the target gene that has high variability or
mutational frequency. For example, sequence analysis can be used to identify regions of
apoB-100 or beta catenin that have high variability or mutational frequency. In another
embodiment, the region of the target gene having high variability or mutational frequency is
identified by obtaining or providing genotype information about the target gene from a
population. In particular, the genotype information can be from a population suffering from
a liver disorder, such as hepatocellular carcinoma or hepatoblastoma. 

In another aspect, the invention features a method for reducing apoB-100 levels in a
subject, e.g., a mammal, such as a human. The method includes administering to a subject an
iRNA agent which targets apoB-100. The iRNA agent can be one described herein, and can be
a dsRNA that has a sequence that is substantially identical to a sequence of the apoB-100
gene. The iRNA can be less than 30 nucleotides in length, e.g., 21-23 nucleotides.
Preferably, the iRNA is 21 nucleotides in length. In one embodiment, the iRNA is 21
nucleotides in length, and the duplex region of the iRNA is 19 nucleotides. In another
embodiment, the iRNA is greater than 30 nucleotides in length.

In a preferred embodiment, the subject is treated with an iRNA agent which targets
one of the sequences listed in Tables 5 and 6. In a preferred embodiment it targets both
sequences of a palindromic pair provided in Tables 5 and 6. The most preferred targets are
listed in descending order of preferability; in other words, the more preferred targets are
listed earlier in Tables 5 and 6.

In a preferred embodiment the iRNA agent will include regions, or strands, which are 
complementary to a pair in Tables 5 and 6. In a preferred embodiment the iRNA agent will
include regions complementary to the palindromic pairs of Tables 5 and 6 as a duplex region.

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Tables 5 and 6 but will not be perfectly complementary with the target sequence,
e.g., it will not be complementary at at least 1 base pair. Preferably it will have no more than
1, 2, 3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5' 
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail 
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be
complementary to the gene sequences being targeted or can be other sequence. TT is a
preferred overhang sequence. The first and second iRNA agent sequences can also be joined,
e.g., by additional bases to form a hairpin, or by other non-base linkers.
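
The geometry described above (21-nucleotide strands, a 19-nucleotide duplex region, and 2-nucleotide 3' overhangs such as TT) can be made concrete with a small construction. The Python sketch below is an illustration only: it assumes a 19-nucleotide target region, a sense strand identical to that region, a fully complementary antisense strand, and the same overhang on both strands. The function name `design_duplex` and the example target are hypothetical, and the target is not a sequence from Tables 5 and 6.

```python
from typing import Tuple

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def design_duplex(target_region: str, overhang: str = "TT") -> Tuple[str, str]:
    """Return (sense, antisense) strands, each written 5'->3'.

    The sense strand repeats the target region; the antisense strand is its
    reverse complement. The same overhang is appended to the 3' end of each.
    """
    region = target_region.upper().replace("T", "U")
    if not region or set(region) - set("ACGU"):
        raise ValueError("target region must contain only A, C, G, U (or T)")
    antisense_core = region.translate(COMPLEMENT)[::-1]
    return region + overhang, antisense_core + overhang


# Hypothetical 19-nucleotide target region.
sense, antisense = design_duplex("GGAUCCAGGCAUUCAAGCU")
print(sense, len(sense))
print(antisense, len(antisense))
```

Each strand is 21 nucleotides long, of which the first 19 form the duplex region and the last 2 form the 3' overhang. An overhang complementary to the targeted gene sequence, which the text also permits, would be supplied through the `overhang` argument on a per-strand basis in a fuller design.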

The iRNA agent that targets apoB-100 can be administered in an amount sufficient to 
reduce expression of apoB-100 mRNA. In one embodiment, the iRNA agent is administered 
in an amount sufficient to reduce expression of apoB-100 protein (e.g., by at least 2%, 4%, 
6%, 10%, 15%, or 20%). Preferably, the iRNA agent does not reduce expression of apoB-48
mRNA or protein. This can be effected, e.g., by selection of an iRNA agent which 
specifically targets the nucleotides subject to RNA editing in the apoB-100 transcript. 

The iRNA agent that targets apoB-100 can be administered to a subject, wherein the 
subject is suffering from a disorder characterized by elevated or otherwise unwanted 
expression of apoB-100, elevated or otherwise unwanted levels of cholesterol, and/or 
dysregulation of lipid metabolism. The iRNA agent can be administered to an individual at
risk for the disorder to delay onset of the disorder or a symptom of the disorder. These
disorders include HDL/LDL cholesterol imbalance; dyslipidemias, e.g., familial combined
hyperlipidemia (FCHL), acquired hyperlipidemia; hypercholesterolemia; statin-resistant
hypercholesterolemia; coronary artery disease (CAD); coronary heart disease (CHD); and
atherosclerosis. In one embodiment, the iRNA that targets apoB-100 is administered to a
subject suffering from statin-resistant hypercholesterolemia. 

The apoB-100 iRNA agent can be administered in an amount sufficient to reduce 
levels of serum LDL-C and/or HDL-C and/or total cholesterol in a subject. For example, the 
iRNA is administered in an amount sufficient to decrease total cholesterol by at least 0.5%, 
1%, 2.5%, 5%, or 10% in the subject. In one embodiment, the iRNA agent is administered in
an amount sufficient to reduce the risk of myocardial infarction in the subject.

In a preferred embodiment the iRNA agent is administered repeatedly. 
Administration of an iRNA agent can be carried out over a range of time periods. It can be 
administered daily, once every few days, weekly, or monthly. The timing of administration 
can vary from patient to patient, depending on such factors as the severity of a patient's
symptoms. For example, an effective dose of an iRNA agent can be administered to a patient 
once a month for an indefinite period of time, or until the patient no longer requires therapy. 
In addition, sustained release compositions containing an iRNA agent can be used to 
maintain a relatively constant dosage in the patient's blood. 

In one embodiment, the iRNA agent can be targeted to the liver, and apoB expression 
levels are decreased in the liver following administration of the apoB iRNA agent. For
example, the iRNA agent can be complexed with a moiety that targets the liver, e.g., an
antibody or ligand that binds a receptor on the liver.

The iRNA agent, particularly an iRNA agent that targets apoB, beta-catenin or
glucose-6-phosphatase RNA, can be targeted to the liver, for example by associating, e.g.,
conjugating the iRNA agent to a lipophilic moiety, e.g., a lipid, cholesterol, oleyl, retinyl, or
cholesteryl residue (see Table 1). Other lipophilic moieties that can be associated, e.g.,
conjugated with the iRNA agent include cholic acid, adamantane acetic acid, 1-pyrene
butyric acid, dihydrotestosterone, 1,3-Bis-O(hexadecyl)glycerol, geranyloxyhexyl group,
hexadecylglycerol, borneol, menthol, 1,3-propanediol, heptadecyl group, palmitic acid,
myristic acid, O3-(oleoyl)lithocholic acid, O3-(oleoyl)cholenic acid, dimethoxytrityl, or
phenoxazine. In one embodiment, the iRNA agent can be targeted to the liver by associating,
e.g., conjugating, the iRNA agent to a low-density lipoprotein (LDL), e.g., a lactosylated
LDL. In another embodiment, the iRNA agent can be targeted to the liver by associating,
e.g., conjugating, the iRNA agent to a polymeric carrier complex with sugar residues. 

In another embodiment, the iRNA agent can be targeted to the liver by associating, 
e.g., conjugating, the iRNA agent to a liposome complexed with sugar residues. A targeting
agent that incorporates a sugar, e.g., galactose and/or analogues thereof, is particularly useful.
These agents target, in particular, the parenchymal cells of the liver (see Table 1). In a
preferred embodiment, the targeting moiety includes more than one galactose moiety,
preferably two or three. Preferably, the targeting moiety includes 3 galactose moieties, e.g.,
spaced about 15 angstroms from each other. The targeting moiety can be lactose. A lactose
is a glucose coupled to a galactose. Preferably, the targeting moiety includes three lactoses.
The targeting moiety can also be N-Acetyl-Galactosamine or N-Ac-Glucosamine. A mannose
or mannose-6-phosphate targeting moiety can be used for macrophage targeting.

The targeting agent can be linked directly, e.g., covalently or non covalently, to the 
iRNA agent, or to another delivery or formulation modality, e.g., a liposome. E.g., the iRNA 
agents with or without a targeting moiety can be incorporated into a delivery modality, e.g., a
liposome, with or without a targeting moiety. 

It is particularly preferred to use an iRNA agent conjugated to a lipophilic molecule,
wherein the iRNA agent is an apoB, beta-catenin or glucose-6-phosphatase iRNA
targeting agent.

In one embodiment, the iRNA agent has been modified, or is associated with a 
delivery agent, e.g., a delivery agent described herein, e.g., a liposome, which has been
modified to alter distribution in favor of the liver. In one embodiment, the modification
mediates association with a serum albumin (SA), e.g., a human serum albumin (HSA), or a 
fragment thereof. 

The iRNA agent, particularly an iRNA agent that targets apoB, beta-catenin or 
glucose-6-phosphatase RNA, can be targeted to the liver, for example by associating, e.g., 
conjugating the iRNA agent to an SA molecule, e.g., an HSA molecule, or a fragment 
thereof. In one embodiment, the iRNA agent or composition thereof has an affinity for an
SA, e.g., HSA, which is sufficiently high such that its levels in the liver are at least 10, 20,
30, 50, or 100% greater in the presence of SA, e.g., HSA, or is such that addition of 
exogenous SA will increase delivery to the liver. These criteria can be measured, e.g., by 
testing distribution in a mouse in the presence or absence of exogenous mouse or human SA. 
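
The liver-level criterion above reduces to a percent-increase calculation between two measurements. The Python sketch below shows that arithmetic under stated assumptions: liver levels are measured in the same units with and without exogenous SA, and the measurement values are invented for illustration and are not data from this application. The function names `percent_increase` and `thresholds_met` are hypothetical.

```python
from typing import List, Sequence


def percent_increase(level_without_sa: float, level_with_sa: float) -> float:
    """Percent increase of the liver level measured with SA over the level without SA."""
    if level_without_sa <= 0:
        raise ValueError("baseline level must be positive")
    return (level_with_sa - level_without_sa) / level_without_sa * 100.0


def thresholds_met(increase: float,
                   thresholds: Sequence[float] = (10, 20, 30, 50, 100)) -> List[float]:
    """Return the recited thresholds (in percent) that the increase reaches."""
    return [t for t in thresholds if increase >= t]


# Invented example values in arbitrary units (not experimental data).
increase = percent_increase(level_without_sa=4.0, level_with_sa=6.5)
print(increase, thresholds_met(increase))
```

In this invented example the level rises by 62.5%, which reaches the 10, 20, 30, and 50% thresholds but not the 100% threshold.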

The SA, e.g., HSA, targeting agent can be linked directly, e.g., covalently or
non-covalently, to the iRNA agent, or to another delivery or formulation modality, e.g., a
liposome. E.g., the iRNA agents with or without a targeting moiety can be incorporated into 
a delivery modality, e.g., a liposome, with or without a targeting moiety. 

It is particularly preferred to use an iRNA conjugated to an SA, e.g., an HSA, 
molecule wherein the iRNA agent is an apoB, beta-catenin or glucose-6-phosphatase iRNA
targeting agent. 

In another aspect, the invention features a method for reducing glucose-6-phosphatase
levels in a subject, e.g., a mammal, such as a human. The method includes
administering to a subject an iRNA agent which targets glucose-6-phosphatase. The iRNA
agent can be a dsRNA that has a sequence that is substantially identical to a sequence of the
glucose-6-phosphatase gene.

In a preferred embodiment, the subject is treated with an iRNA agent which targets 
one of the sequences listed in Table 7. In a preferred embodiment it targets both sequences 
of a palindromic pair provided in Table 7. The most preferred targets are listed in
descending order of preferability; in other words, the more preferred targets are listed earlier
in Table 7. 

In a preferred embodiment the iRNA agent will include regions, or strands, which are 
complementary to a pair in Table 7. In a preferred embodiment the iRNA agent will include
regions complementary to the palindromic pairs of Table 7 as a duplex region.

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Table 7 but will not be perfectly complementary with the target sequence, e.g., it
will not be complementary at at least 1 base pair. Preferably it will have no more than 1, 2,
3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5' 
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail 
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be 
complementary to the gene sequences being targeted or can be other sequence. TT is a 
preferred overhang sequence. The first and second iRNA agent sequences can also be joined, 
e.g., by additional bases to form a hairpin, or by other non-base linkers. 

Table 7 refers to sequences from human glucose-6-phosphatase. Table 8 refers to
sequences from rat glucose-6-phosphatase. The sequences from Table 8 can be used, e.g., in
experiments with rats or cultured rat cells. 

In a preferred embodiment, the iRNA agent can have any architecture, e.g., an architecture
described herein. E.g., it can be incorporated into an iRNA agent having an overhang 
structure, overall length, hairpin vs. two-strand structure, as described herein. In addition, 
monomers other than naturally occurring ribonucleotides can be used in the selected iRNA 
agent. 

The iRNA that targets glucose-6-phosphatase can be administered in an amount
sufficient to reduce expression of glucose-6-phosphatase mRNA. 

The iRNA that targets glucose-6-phosphatase can be administered to a subject to 
inhibit hepatic glucose production, for the treatment of glucose-metabolism-related disorders, 
such as diabetes, e.g., type-2-diabetes mellitus. The iRNA agent can be administered to an 
individual at risk for the disorder to delay onset of the disorder or a symptom of the disorder. 

In other embodiments, iRNA agents having sequence similarity to the following
genes can also be used to inhibit hepatic glucose production. These other genes include
forkhead homologue in rhabdomyosarcoma (FKHR); glucagon; glucagon receptor;
glycogen phosphorylase; PPAR-Gamma Coactivator (PGC-1); Fructose-1,6-bisphosphatase;
glucose-6-phosphate locator; glucokinase inhibitory regulatory protein; and
phosphoenolpyruvate carboxykinase (PEPCK).

In one embodiment, the iRNA agent can be targeted to the liver, and RNA expression
levels of the targeted genes are decreased in the liver following administration of the iRNA
agent.

The iRNA agent can be one described herein, and can be a dsRNA that has a
sequence that is substantially identical to a sequence of a target gene. The iRNA can be less
than 30 nucleotides in length, e.g., 21-23 nucleotides. Preferably, the iRNA is 21 nucleotides
in length. In one embodiment, the iRNA is 21 nucleotides in length, and the duplex region of
the iRNA is 19 nucleotides. In another embodiment, the iRNA is greater than 30 nucleotides
in length.

In another aspect, the invention features a method for reducing beta-catenin levels in
a subject, e.g., a mammal, such as a human. The method includes administering to a subject 
an iRNA agent that targets beta-catenin. The iRNA agent can be one described herein, and 
can be a dsRNA that has a sequence that is substantially identical to a sequence of the beta- 
catenin gene. The iRNA can be less than 30 nucleotides in length, e.g., 21-23 nucleotides. 
Preferably, the iRNA is 21 nucleotides in length. In one embodiment, the iRNA is 21 
nucleotides in length, and the duplex region of the iRNA is 19 nucleotides. In another
embodiment, the iRNA is greater than 30 nucleotides in length. 

In a preferred embodiment, the subject is treated with an iRNA agent which targets 
one of the sequences listed in Table 9. In a preferred embodiment it targets both sequences
of a palindromic pair provided in Table 9. The most preferred targets are listed in
descending order of preferability; in other words, the more preferred targets are listed earlier
in Table 9. 

In a preferred embodiment the iRNA agent will include regions, or strands, which are 
complementary to a pair in Table 9. In a preferred embodiment the iRNA agent will include 
regions complementary to the palindromic pairs of Table 9 as a duplex region.

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Table 9 but will not be perfectly complementary with the target sequence, e.g., it
will not be complementary at one or more base pairs. Preferably it will have no more than 1, 2,
3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5'
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be
complementary to the gene sequences being targeted or can be other sequence. TT is a 
preferred overhang sequence. The first and second iRNA agent sequences can also be joined, 
e.g., by additional bases to form a hairpin, or by other non-base linkers. 

The iRNA agent that targets beta-catenin can be administered in an amount sufficient 
to reduce expression of beta-catenin mRNA. In one embodiment, the iRNA agent is
administered in an amount sufficient to reduce expression of beta-catenin protein (e.g., by at
least 2%, 4%, 6%, 10%, 15%, 20%).

The iRNA agent that targets beta-catenin can be administered to a subject, wherein 
the subject is suffering from a disorder characterized by unwanted cellular proliferation in the 
liver or of liver tissue, e.g., metastatic tissue originating from the liver. Examples include a
benign or malignant disorder, e.g., a cancer, e.g., a hepatocellular carcinoma (HCC), hepatic
metastasis, or hepatoblastoma.

The iRNA agent can be administered to an individual at risk for the disorder to delay
onset of the disorder or a symptom of the disorder.

In a preferred embodiment the iRNA agent is administered repeatedly. 
Administration of an iRNA agent can be carried out over a range of time periods. It can be 
administered daily, once every few days, weekly, or monthly. The timing of administration 
can vary from patient to patient, depending on such factors as the severity of a patient's 
symptoms. For example, an effective dose of an iRNA agent can be administered to a patient 
once a month for an indefinite period of time, or until the patient no longer requires therapy.
In addition, sustained release compositions containing an iRNA agent can be used to 
maintain a relatively constant dosage in the patient's blood. 

In one embodiment, the iRNA agent can be targeted to the liver, and beta-catenin
expression levels are decreased in the liver following administration of the beta-catenin iRNA
agent. For example, the iRNA agent can be complexed with a moiety that targets the liver,
e.g., an antibody or ligand that binds a receptor on the liver.

In another aspect, the invention provides methods to treat liver disorders, e.g., 
disorders characterized by unwanted cell proliferation, hematological disorders, disorders 
characterized by inflammation, and metabolic or viral diseases or disorders of the
liver. A proliferation disorder of the liver can be, for example, a benign or malignant
disorder, e.g., a cancer, e.g., a hepatocellular carcinoma (HCC), hepatic metastasis, or
hepatoblastoma. A hepatic hematology or inflammation disorder can be a disorder involving
clotting factors, a complement-mediated inflammation or a fibrosis, for example. Metabolic
diseases of the liver can include dyslipidemias and irregularities in glucose regulation. Viral
diseases of the liver can include hepatitis C or hepatitis B. In one embodiment, a liver
disorder is treated by administering one or more iRNA agents that have a sequence that is 
substantially identical to a sequence in a gene involved in the liver disorder. 

In one embodiment an iRNA agent to treat a liver disorder has a sequence which is 
substantially identical to a sequence of the beta-catenin or c-jun gene. In another 
embodiment, such as for the treatment of hepatitis C or hepatitis B, the iRNA agent can have 
a sequence that is substantially identical to a sequence of a gene of the hepatitis C virus or the
hepatitis B virus, respectively. For example, the iRNA agent can target the 5' core region of
HCV. This region lies just downstream of the ribosomal toe-print straddling the initiator
methionine. Alternatively, an iRNA agent of the invention can target any one of the
nonstructural proteins of HCV: NS3, 4A, 4B, 5A, or 5B. For the treatment of hepatitis B, an
iRNA agent can target the protein X (HBx) gene, for example.

In a preferred embodiment, the subject is treated with an iRNA agent which targets
one of the sequences listed in Table 10. In a preferred embodiment it targets both sequences
of a palindromic pair provided in Table 10. The most preferred targets are listed in
descending order of preferability; in other words, the more preferred targets are listed earlier
in Table 10.

In a preferred embodiment the iRNA agent will include regions, or strands, which are 
complementary to a pair in Table 10. In a preferred embodiment the iRNA agent will 
include regions complementary to the palindromic pairs of Table 10 as a duplex region.



In a preferred embodiment the duplex region of the iRNA agent will target a sequence 
listed in Table 10, but will not be perfectly complementary with the target sequence, e.g., it 
will not be complementary at one or more base pairs. Preferably it will have no more than 1, 2,
3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5' 
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be
complementary to the gene sequences being targeted or can be other sequence. TT is a 
preferred overhang sequence. The first and second iRNA agent sequences can also be joined, 
e.g., by additional bases to form a hairpin, or by other non-base linkers. 

In another aspect, an iRNA agent can be administered to modulate blood clotting, 
e.g., to reduce the tendency to form a blood clot. In a preferred embodiment the iRNA agent 
targets Factor V expression, preferably in the liver. One or more iRNA agents can be used to 
target a wild type allele, a mutant allele, e.g., the Leiden Factor V allele, or both. Such 
administration can be used to treat or prevent venous thrombosis, e.g., deep vein thrombosis 
or pulmonary embolism, or another disorder caused by elevated or otherwise unwanted 
expression of Factor V, in, e.g., the liver. In one embodiment the iRNA agent can treat a
subject, e.g., a human who has Factor V Leiden or other genetic trait associated with an
unwanted tendency to form blood clots.

In a preferred embodiment administration of an iRNA agent which targets Factor V is
combined with the administration of a second treatment, e.g., a treatment which reduces the
tendency of the blood to clot, e.g., the administration of heparin or of a low molecular weight heparin.

In one embodiment, the iRNA agent that targets Factor V can be used as a 
prophylaxis in patients, e.g., patients with Factor V Leiden, who are placed at risk for a 
thrombosis, e.g., those about to undergo surgery, in particular those about to undergo high- 
risk surgical procedures known to be associated with formation of venous thrombosis, those 
about to undergo a prolonged period of relative inactivity, e.g., on a motor vehicle, train or
airplane flight, e.g., a flight or other trip lasting more than three or five hours. Such a 
treatment can be an adjunct to the therapeutic use of low molecular weight (LMW) heparin 
prophylaxis.

In another embodiment, the iRNA agent that targets Factor V can be administered to
patients with Factor V Leiden to treat deep vein thrombosis (DVT) or pulmonary embolism
(PE). Such a treatment can be an adjunct to (or can replace) therapeutic uses of heparin or
coumadin. The treatment can be administered by inhalation or generally by pulmonary
routes. 

In a preferred embodiment, an iRNA agent administered to treat a liver disorder is 
targeted to the liver. For example, the iRNA agent can be complexed with a targeting
moiety, e.g., an antibody or ligand that recognizes a liver-specific receptor. 

The invention also includes preparations, including substantially pure or 
pharmaceutically acceptable preparations of iRNA agents which silence any of the genes 
discussed herein and in particular for any of apoB-100, glucose-6-phosphatase, beta-catenin,
factor V, or any of the HCV genes discussed herein.

The methods and compositions of the invention, e.g., the methods and compositions 
to treat diseases and disorders of the liver described herein, can be used with any of the iRNA 
agents described. In addition, the methods and compositions of the invention can be used for
the treatment of any disease or disorder described herein, and for the treatment of any
subject, e.g., any animal, any mammal, such as any human. 

In another aspect, the invention features a method of selecting two sequences or
strands for use in an iRNA agent. The method includes:

providing a first candidate sequence and a second candidate sequence;

determining the value of a parameter which is a function of the number of
palindromic pairs between the first and second sequence, wherein a palindromic pair is a
nucleotide on said first sequence which, when the sequences are aligned in anti-parallel
orientation, will hybridize with a nucleotide on said second sequence;

comparing the value with a predetermined reference value, and if the value has
a predetermined relationship with the reference, e.g., if it is the same or greater, selecting the
sequences for use in an iRNA agent. In most cases each of the two sequences will be 
completely complementary with a target sequence (though as described elsewhere herein that 
may not always be the case, there may not be perfect complementarity with one or both of 
the target sequences) and will have sufficient complementarity with each other to form a 
duplex. The parameter can be derived, e.g., by directly determining the number of
palindromic pairs, e.g., by inspection or by the use of a computer program which compares
or analyzes sequence. The parameter can also be determined less directly, and include, e.g.,
calculation or measurement of the Tm or other value related to the free energy of
association or dissociation of a duplex.
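
By way of illustration only, the direct determination of the number of palindromic pairs by a computer program might be sketched as follows. The function names, the restriction to classical Watson-Crick pairs, and the treatment of T as U are assumptions made for this sketch and are not taken from the specification.

```python
# Classical Watson-Crick partners (assumption: non-canonical pairings are ignored here).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def normalize(seq: str) -> str:
    """Upper-case the sequence and read T as U so DNA and RNA inputs compare alike."""
    return seq.upper().replace("T", "U")


def count_palindromic_pairs(first: str, second: str) -> int:
    """Count positions that pair when the two 5'->3' sequences are aligned anti-parallel."""
    first, second = normalize(first), normalize(second)
    if len(first) != len(second):
        raise ValueError("sequences must have the same length")
    # Reversing `second` places its 3' end opposite the 5' end of `first`.
    return sum((a, b) in PAIRS for a, b in zip(first, reversed(second)))


def select_for_irna(first: str, second: str, reference: int) -> bool:
    """Select the sequences if the count is the same as or greater than the reference."""
    return count_palindromic_pairs(first, second) >= reference
```

For example, `count_palindromic_pairs("AUGC", "GCAU")` counts four pairs, so the two sequences would be selected against a reference value of 4.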

In a preferred embodiment the determination can be performed on a target sequence,
e.g., a genomic sequence. In such embodiments the selected sequence is converted to its
complement in the iRNA agent.

In a preferred embodiment the first and second sequences are selected from the
sequence of a single target gene. In other embodiments the first sequence is selected from
the sequence of a first target gene and the second sequence is selected from the sequence of a
second target gene.

In a preferred embodiment the method includes comparing blocks of sequence, e.g.,
blocks which are between 15 and 25 nucleotides in length, and preferably 19, 20, or 21, and
most preferably 19 nucleotides in length, to determine if they are suitable for use, e.g., if they
possess sufficient palindromic pairs.
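
The block comparison could, for instance, be carried out by sliding a fixed-size window along each sequence and scoring every pair of blocks. The sketch below is illustrative only; the function names, the default reference value of 15, and the exhaustive all-against-all search are assumptions rather than features recited herein.

```python
from itertools import product

# Classical Watson-Crick partners (assumption: T is read as U).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def blocks(seq: str, size: int = 19):
    """Return (start, block) for every window of `size` nucleotides in `seq`."""
    seq = seq.upper().replace("T", "U")
    return [(i, seq[i:i + size]) for i in range(len(seq) - size + 1)]


def palindromic_pairs(a: str, b: str) -> int:
    """Count pairing positions between two equal-length blocks aligned anti-parallel."""
    return sum((x, y) in PAIRS for x, y in zip(a, reversed(b)))


def candidate_block_pairs(first_seq: str, second_seq: str, size: int = 19, reference: int = 15):
    """List (start_in_first, start_in_second, count) for block pairs meeting the reference."""
    hits = []
    for (i, a), (j, b) in product(blocks(first_seq, size), blocks(second_seq, size)):
        n = palindromic_pairs(a, b)
        if n >= reference:
            hits.append((i, j, n))
    return hits
```

Passing the same sequence as both arguments searches within a single target gene; passing two different sequences searches across two target genes.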

In a preferred embodiment the first and second sequences are divided into a plurality
of regions, e.g., terminal regions and a middle region disposed between the terminal regions,
wherein the reference value, or the predetermined relationship to the reference value, is
different for at least two regions. E.g., the first and second sequences, when aligned in
anti-parallel orientation, are divided into terminal regions each of a selected number of base pairs,
e.g., 2, 3, 4, 5, or 6, and a middle region, and the reference value for the terminal regions is
higher than for the middle region. In other words, a higher number or proportion of
palindromic pairs is required in the terminal regions.
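
A region-specific comparison of this kind might be sketched as follows. The use of proportions rather than counts, the default terminal length of 4, and the reference values 1.0 and 0.5 are placeholder assumptions chosen only to make the example concrete.

```python
# Classical Watson-Crick partners (assumption: T is read as U).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pair_flags(first: str, second: str):
    """Return one True/False flag per position of the anti-parallel alignment."""
    first = first.upper().replace("T", "U")
    second = second.upper().replace("T", "U")
    if len(first) != len(second):
        raise ValueError("sequences must have the same length")
    return [(a, b) in PAIRS for a, b in zip(first, reversed(second))]


def passes_regional_references(first, second, terminal_len=4,
                               terminal_ref=1.0, middle_ref=0.5):
    """Require a higher proportion of pairs in each terminal region than in the middle."""
    flags = pair_flags(first, second)
    if len(flags) <= 2 * terminal_len:
        raise ValueError("sequences are too short for the chosen terminal length")
    left = flags[:terminal_len]
    right = flags[-terminal_len:]
    middle = flags[terminal_len:-terminal_len]

    def fraction(region):
        return sum(region) / len(region)

    return (fraction(left) >= terminal_ref
            and fraction(right) >= terminal_ref
            and fraction(middle) >= middle_ref)
```

With these placeholder values, two mismatches in the middle of a 12-nucleotide alignment are tolerated, whereas a single mismatch in either terminal region causes the pair of sequences to be rejected.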

In a preferred embodiment the first and second sequences are gene sequences; thus the
complements of the sequences will be used in an iRNA agent.

In a preferred embodiment hybridize means a classical Watson-Crick pairing. In other
embodiments hybridize can include non-Watson-Crick pairing, e.g., pairings seen in micro
RNA precursors.

In a preferred embodiment the method includes the addition of nucleotides to form 
overhangs, e.g., 3' or 5' overhangs, preferably one or more 3' overhangs. Overhangs are 
discussed in detail elsewhere herein but are preferably about 2 nucleotides in length. The
overhangs can be complementary to the gene sequences being targeted or can be other
sequence. TT is a preferred overhang sequence. The first and second iRNA agent sequences
can also be joined, e.g., by additional bases to form a hairpin, or by other non-base linkers.
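
As a simple illustration, the addition of overhangs and the joining of two selected sequences can be expressed as string operations. The function names are hypothetical, and the loop bases passed to `join_as_hairpin` are supplied by the caller because no particular loop sequence is assumed here.

```python
def add_overhang(strand: str, overhang: str = "TT", end: str = "3'") -> str:
    """Append the overhang at the 3' end (default) or prepend it at the 5' end.

    The strand is written 5'->3', so a 3' overhang goes on the right.
    """
    if end == "3'":
        return strand + overhang
    if end == "5'":
        return overhang + strand
    raise ValueError("end must be 3' or 5'")


def join_as_hairpin(first: str, second: str, loop: str) -> str:
    """Join two selected sequences through additional loop bases into one strand."""
    return first + loop + second


# A 19-nucleotide duplex-forming sequence plus a 2-nucleotide 3' overhang gives a 21-mer.
core = "GAUC" * 4 + "GAU"
strand = add_overhang(core)
```

Here `strand` is 21 nucleotides long and ends in TT, which matches the 21-nucleotide agent with a 19-nucleotide duplex region described elsewhere herein.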

In a preferred embodiment the method is used to select all or part of an iRNA agent.
The selected sequences can be incorporated into an iRNA agent having any architecture, e.g., 
an architecture described herein. E.g., it can be incorporated into an iRNA agent having an 
overhang structure, overall length, hairpin vs. two-strand structure, as described herein. In 
addition, monomers other than naturally occurring ribonucleotides can be used in the selected 
iRNA agent. 

Preferred iRNA agents of this method will target genes expressed in the liver, e.g., 
one of the genes disclosed herein, e.g., apo B, beta-catenin, an HCV gene, or
glucose-6-phosphatase.

In another aspect, the invention features an iRNA agent determined, made, or
selected by a method described herein.

The methods and compositions of the invention, e.g., the methods and iRNA 
compositions to treat liver-based diseases described herein, can be used with any dosage 
and/or formulation described herein, as well as with any route of administration described 
herein. 

The invention also provides for the use of an iRNA agent which includes monomers
which can form other than a canonical Watson-Crick pairing with another monomer, e.g., a
monomer on another strand.

The use of "other than canonical Watson-Crick pairing" between monomers of a 
duplex can be used to control, often to promote, melting of all or part of a duplex. The iRNA 
agent can include a monomer at a selected or constrained position that results in a first level
of stability in the iRNA agent duplex (e.g., between the two separate molecules of a double
stranded iRNA agent) and a second level of stability in a duplex between a sequence of an
iRNA agent and another sequence molecule, e.g., a target or off-target sequence in a subject.
In some cases the second duplex has a relatively greater level of stability, e.g., in a duplex
between an anti-sense sequence of an iRNA agent and a target mRNA. In this case one or
more of the monomers, the position of the monomers in the iRNA agent, and the target
sequence (sometimes referred to herein as the selection or constraint parameters), are
selected such that the iRNA agent duplex has a comparatively lower free energy of
association (which, while not wishing to be bound by mechanism or theory, is believed to
contribute to efficacy by promoting disassociation of the duplex iRNA agent in the context of
the RISC) while the duplex formed between an anti-sense targeting sequence and its target
sequence has a relatively higher free energy of association (which, while not wishing to be
bound by mechanism or theory, is believed to contribute to efficacy by promoting association
of the anti-sense sequence and the target RNA).

In other cases the second duplex has a relatively lower level of stability, e.g., in a 
duplex between a sense sequence of an iRNA agent and an off-target mRNA. In this case 
one or more of the monomers, the position of the monomers in the iRNA agent, and an
off-target sequence, are selected such that the iRNA agent duplex has a comparatively higher
free energy of association while the duplex formed between a sense targeting sequence and
its off-target sequence has a relatively lower free energy of association (which, while not
wishing to be bound by mechanism or theory, is believed to reduce the level of off-target
silencing by promoting disassociation of the duplex formed by the
sense strand and the off-target sequence).

Thus, inherent in the structure of the iRNA agent is the property of having a first 
stability for the intra-iRNA agent duplex and a second stability for a duplex formed between 
a sequence from the iRNA agent and another RNA, e.g., a target mRNA. As discussed 
above, this can be accomplished by judicious selection of one or more of the monomers at a 
selected or constrained position, the selection of the position in the duplex to place the
selected or constrained position, and selection of the sequence of a target sequence (e.g., the
particular region of a target gene which is to be targeted). The iRNA agent sequences which
satisfy these requirements are sometimes referred to herein as constrained sequences. Exercise
of the constraint or selection parameters can be, e.g., by inspection, or by computer assisted 
methods. Exercise of the parameters can result in selection of a target sequence and of 
particular monomers to give a desired result in terms of the stability, or relative stability, of a 
duplex. 

Thus, in one aspect, the invention features an iRNA agent which includes: a first
sequence which targets a first target region and a second sequence which targets a second
target region. The first and second sequences have sufficient complementarity to each other
to hybridize, e.g., under physiological conditions, e.g., under physiological conditions but not
in contact with a helicase or other unwinding enzyme. In a duplex region of the iRNA agent,
at a selected or constrained position, the first target region has a first monomer, and the
second target region has a second monomer. The first and second monomers occupy
complementary or corresponding positions. One, and preferably both, monomers are selected
such that the stability of the pairing the monomers contribute to a duplex between the first
and second sequence will differ from the stability of the pairing between the first or second
sequence and a target sequence.

Usually, the monomers will be selected (selection of the target sequence may be
required as well) such that they form a pairing in the iRNA agent duplex which has a lower
free energy of dissociation, and a lower Tm, than will be possessed by the pairing of the
monomer with its complementary monomer in a duplex between the iRNA agent sequence
and a target RNA.

The constraint placed upon the monomers can be applied at a selected site or at more
than one selected site. By way of example, the constraint can be applied at more than 1, but
less than 3, 4, 5, 6, or 7 sites in an iRNA agent duplex.

A constrained or selected site can be present at a number of positions in the iRNA
agent duplex. E.g., a constrained or selected site can be present within 3, 4, 5, or 6 positions
from either end, 3' or 5', of a duplexed sequence. A constrained or selected site can be
present in the middle of the duplex region, e.g., it can be more than 3, 4, 5, or 6 positions
from the end of a duplexed region.

The iRNA agent can be selected to target a broad spectrum of genes, including any of
the genes described herein.

In a preferred embodiment the iRNA agent has an architecture (architecture refers to
one or more of overall length, length of a duplex region, the presence, number, location, or
length of overhangs, single strand versus double strand form) described herein.

E.g., the iRNA agent can be less than 30 nucleotides in length, e.g., 21-23 
nucleotides. Preferably, the iRNA is 21 nucleotides in length and there is a duplex region of
about 19 pairs. In one embodiment, the iRNA is 21 nucleotides in length, and the duplex
region of the iRNA is 19 nucleotides. In another embodiment, the iRNA is greater than 30
nucleotides in length.

In some embodiments the duplex region of the iRNA agent will have mismatches in
addition to the selected or constrained site or sites. Preferably it will have no more than 1, 2,
3, 4, or 5 bases which do not form canonical Watson-Crick pairs or which do not hybridize.
Overhangs are discussed in detail elsewhere herein but are preferably about 2 nucleotides in
length. The overhangs can be complementary to the gene sequences being targeted or can be
other sequence. TT is a preferred overhang sequence. The first and second iRNA agent
sequences can also be joined, e.g., by additional bases to form a hairpin, or by other non-base
linkers. 

The monomers can be selected such that: first and second monomers are naturally 
occurring ribonucleotides, or modified ribonucleotides having naturally occurring bases, and 
when occupying complementary sites either do not pair and have no substantial level of
H-bonding, or form a non-canonical Watson-Crick pairing with a non-canonical pattern of
H bonding, which usually has a lower free energy of dissociation than seen in a canonical
Watson-Crick pairing, or otherwise pair to give a free energy of association which is less
than that of a preselected value or is less, e.g., than that of a canonical pairing. When one (or
both) of the iRNA agent sequences duplexes with a target, the first (or second) monomer
forms a canonical Watson-Crick pairing with the base in the complementary position on the
target, or forms a non-canonical Watson-Crick pairing having a higher free energy of
dissociation and a higher Tm than seen in the pairing in the iRNA agent. The classical
Watson-Crick pairings are as follows: A-T, G-C, and A-U. Non-canonical Watson-Crick
pairings are known in the art and can include U-U, G-G, G-Atrans, G-Acis, and G-U.
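
For illustration, the pairings listed above can be held in a small lookup so that a pairing at a selected site is classified by base identity. The function name and the "unlisted" category are assumptions of this sketch, and the cis and trans forms of the G-A pairing are not distinguished because they share the same bases.

```python
# Classical Watson-Crick pairings named above: A-T, G-C, and A-U.
CANONICAL = {frozenset("AT"), frozenset("GC"), frozenset("AU")}

# Non-canonical pairings named above: U-U, G-G, G-A (cis or trans), and G-U.
NON_CANONICAL = {frozenset("U"), frozenset("G"), frozenset("GA"), frozenset("GU")}


def classify_pair(x: str, y: str) -> str:
    """Classify an unordered pair of bases as canonical, non-canonical, or unlisted."""
    pair = frozenset((x.upper(), y.upper()))
    if pair in CANONICAL:
        return "canonical"
    if pair in NON_CANONICAL:
        return "non-canonical"
    return "unlisted"
```

A site whose monomers classify as "non-canonical" in the iRNA agent duplex but "canonical" in the duplex with the target would be a candidate constrained site in the sense described here.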

The monomer in one or both of the sequences is selected such that it does not pair, or
forms a pair with its corresponding monomer in the other sequence which minimizes stability
(e.g., the H bonding formed between the monomer at the selected site in the one sequence
and its monomer at the corresponding site in the other sequence is less stable than the H
bonds formed by the monomer of one (or both) of the sequences with the respective target
sequence). The monomer in one or both strands is also chosen to promote stability in one or
both of the duplexes made by a strand and its target sequence. E.g., one or more of the
monomers and the target sequences are selected such that at the selected or constrained
position, there are no H bonds formed, or a non-canonical pairing is formed in the iRNA
agent duplex, or they otherwise pair to give a free energy of association which is
less than that of a preselected value or is less, e.g., than that of a canonical pairing, but when
one (or both) sequences form a duplex with the respective target, the pairing at the selected
or constrained site is a canonical Watson-Crick pairing.

The inclusion of such monomers will have one or more of the following effects: it
will destabilize the iRNA agent duplex, it will destabilize interactions between the sense
sequence and unintended target sequences, sometimes referred to as off-target sequences, and
duplex interactions between a sequence and the intended target will not be destabilized.

By way of example: 

The monomer at the selected site in the first sequence includes an A (or a modified
base which pairs with T), and the monomer at the selected position in the second sequence
is chosen from a monomer which will not pair or which will form a non-canonical pairing,
e.g., G. These will be useful in applications wherein the target sequence for the first
sequence has a T at the selected position. In embodiments where both target duplexes are
stabilized it is useful wherein the target sequence for the second strand has a monomer which
will form a canonical Watson-Crick pairing with the monomer selected for the selected
position in the second strand.

The monomer at the selected site in the first sequence includes a U (or a modified base
which pairs with A), and the monomer at the selected position in the second sequence is
chosen from a monomer which will not pair or which will form a non-canonical pairing, e.g.,
U or G. These will be useful in applications wherein the target sequence for the first
sequence has an A at the selected position. In embodiments where both target duplexes are
stabilized it is useful wherein the target sequence for the second strand has a monomer which
will form a canonical Watson-Crick pairing with the monomer selected for the selected
position in the second strand.

The monomer at the selected site in the first sequence includes a G (or a modified
base which pairs with C), and the monomer at the selected position in the second sequence
is chosen from a monomer which will not pair or which will form a non-canonical pairing,
e.g., G, Acis, Atrans, or U. These will be useful in applications wherein the target sequence
for the first sequence has a C at the selected position. In embodiments where both target
duplexes are stabilized it is useful wherein the target sequence for the second strand has a
monomer which will form a canonical Watson-Crick pairing with the monomer selected for
the selected position in the second strand.

The monomer at the selected site in the first sequence includes a C (or a modified
base which pairs with G), and the monomer at the selected position in the second sequence
is chosen from a monomer which will not pair or which will form a non-canonical pairing. These
will be useful in applications wherein the target sequence for the first sequence has a G at the
selected position. In embodiments where both target duplexes are stabilized it is useful
wherein the target sequence for the second strand has a monomer which will form a
canonical Watson-Crick pairing with the monomer selected for the selected position in the
second strand.
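
The example partners given above for each monomer in the first sequence can be summarized as a lookup. This is an illustrative sketch only: the names are hypothetical, the cis and trans forms of A are collapsed into a single entry, and the entry for C is left empty because no specific example partner is named for it.

```python
# Example second-sequence monomers named above for each first-sequence monomer.
EXAMPLE_PARTNERS = {
    "A": {"G"},
    "U": {"U", "G"},
    "G": {"G", "A", "U"},  # A covers both the cis and trans pairings
    "C": set(),            # no specific example partner is named for C
}


def destabilizing_partners(first_monomer: str):
    """Return the example partners for a first-sequence monomer, sorted alphabetically."""
    base = first_monomer.upper()
    if base not in EXAMPLE_PARTNERS:
        raise ValueError(f"unsupported monomer: {first_monomer!r}")
    return sorted(EXAMPLE_PARTNERS[base])


def is_listed_example(first_monomer: str, second_monomer: str) -> bool:
    """Check whether a proposed pair at the selected site is one of the listed examples."""
    return second_monomer.upper() in EXAMPLE_PARTNERS.get(first_monomer.upper(), set())
```

Such a table could be consulted when exercising the selection or constraint parameters by computer assisted methods, with further partners added as other non-pairing or non-canonically pairing monomers are identified.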

In another embodiment a non-naturally occurring or modified monomer or monomers
are chosen such that when they occupy the selected or constrained position in an iRNA
agent they exhibit a first free
energy of dissociation and when one (or both) of them pairs with a naturally occurring
monomer, the pair exhibits a second free energy of dissociation, which is usually higher than
that of the pairing of the first and second monomers. E.g., when the first and second
monomers occupy complementary positions they either do not pair and have no substantial 
level of H-bonding, or form a weaker bond than one of them would form with a naturally 
occurring monomer, and reduce the stability of that duplex, but when the duplex dissociates 
at least one of the strands will form a duplex with a target in which the selected monomer 
will promote stability, e.g., the monomer will form a more stable pair with a naturally 
occurring monomer in the target sequence than the pairing it formed in the iRNA agent.

An example of such a pairing is 2-amino A and either of a 2-thio pyrimidine analog
of U or T.

When placed in complementary positions of the iRNA agent these monomers will 
pair very poorly and will minimize stability. However, a duplex formed between 2-amino
A and the U of a naturally occurring target, or between 2-thio U and the A of a
naturally occurring target, or between 2-thio T and the A of a naturally occurring target, will have a
relatively higher free energy of dissociation and be more stable. This is shown in FIG. 1.

The pair shown in FIG. 1 (the 2-amino A and the 2-s U and T) is exemplary. In 
another embodiment, the monomer at the selected position in the sense strand can be a
universal pairing moiety. A universal pairing agent will form some level of H bonding with
more than one and preferably all other naturally occurring monomers. An example of a
universal pairing moiety is a monomer which includes 3-nitro pyrrole. (Examples of other
candidate universal base analogs can be found in the art, e.g., in Loakes, 2001, NAR 29:
2437-2447, hereby incorporated by reference. Examples can also be found in the section on
Universal Bases below.) In these cases the monomer at the corresponding position of the
anti-sense strand can be chosen for its ability to form a duplex with the target and can
include, e.g., A, U, G, or C.

In another aspect, the invention features an iRNA agent which includes: a sense
sequence, which preferably does not target a sequence in a subject, and an anti-sense
sequence, which targets a target gene in a subject. The sense and anti-sense sequences have
sufficient complementarity to each other to hybridize, e.g., under physiological
conditions, e.g., under physiological conditions but not in contact with a helicase or other
unwinding enzyme. In a duplex region of the iRNA agent, at a selected or constrained
position, the monomers are selected such that:

the monomer in the sense sequence is selected such that it does not pair, or forms a
pair with its corresponding monomer in the anti-sense strand which minimizes stability (e.g.,
the H bonding formed between the monomer at the selected site in the sense strand and its
monomer at the corresponding site in the anti-sense strand is less stable than the H bonds
formed by the monomer of the anti-sense sequence and its canonical Watson-Crick partner 
or, if the monomer in the anti-sense strand includes a modified base, the natural analog of the 
modified base and its canonical Watson-Crick partner); 

the monomer in the corresponding position in the anti-sense strand is selected such
that it maximizes the stability of a duplex it forms with the target sequence, e.g., it forms a
canonical Watson-Crick pairing with the monomer in the corresponding position on the target
strand;

optionally, the monomer in the sense sequence is selected such that, it does not pair, 
or forms a pair with its corresponding monomer in the anti-sense strand which minimizes 
stability with an off-target sequence. 

The inclusion of such monomers will have one or more of the following effects: it
will destabilize the iRNA agent duplex, it will destabilize interactions between the sense
sequence and unintended target sequences, sometimes referred to as off-target sequences, and
duplex interactions between the anti-sense strand and the intended target will not be
destabilized.

The constraint placed upon the monomers can be applied at a selected site or at more 
than one selected site. By way of example, the constraint can be applied at more than 1, but 
less than 3, 4, 5, 6, or 7 sites in an iRNA agent duplex.

A constrained or selected site can be present at a number of positions in the iRNA
agent duplex. E.g., a constrained or selected site can be present within 3, 4, 5, or 6 positions
from either end, 3' or 5', of a duplexed sequence. A constrained or selected site can be
present in the middle of the duplex region, e.g., it can be more than 3, 4, 5, or 6 positions
from the end of a duplexed region.
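The positional rule above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `classify_site`, the 1-based numbering, and the default `window` of 3 are assumptions made for the example and do not come from the specification.

```python
def classify_site(position, duplex_length, window=3):
    """Classify a 1-based duplex position as 'near_end' or 'middle'.

    `window` is the number of positions from either end that still counts
    as near the end (the text mentions 3, 4, 5, or 6).
    """
    if not 1 <= position <= duplex_length:
        raise ValueError("position lies outside the duplex")
    # Distance to the closer of the two duplex ends, counting the end itself as 1.
    distance_from_nearest_end = min(position, duplex_length - position + 1)
    return "near_end" if distance_from_nearest_end <= window else "middle"


if __name__ == "__main__":
    # Hypothetical 19-pair duplex.
    for pos in (2, 10, 17):
        print(pos, classify_site(pos, 19))
```

Raising `window` to 4, 5, or 6 reproduces the other thresholds listed in the text.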

The iRNA agent can be selected to target a broad spectrum of genes, including any of
the genes described herein.

In a preferred embodiment the iRNA agent has an architecture (architecture refers to
one or more of overall length, length of a duplex region, the presence, number, location, or
length of overhangs, single strand versus double strand form) described herein.

E.g., the iRNA agent can be less than 30 nucleotides in length, e.g., 21-23
nucleotides. Preferably, the iRNA is 21 nucleotides in length and there is a duplex region of
about 19 pairs. In one embodiment, the iRNA is 21 nucleotides in length, and the duplex
region of the iRNA is 19 nucleotides. In another embodiment, the iRNA is greater than 30
nucleotides in length.

In some embodiments the duplex region of the iRNA agent will have mismatches in
addition to the selected or constrained site or sites. Preferably it will have no more than 1, 2,
3, 4, or 5 bases which do not form canonical Watson-Crick pairs or which do not hybridize.
Overhangs are discussed in detail elsewhere herein but are preferably about 2 nucleotides in
length. The overhangs can be complementary to the gene sequences being targeted or can be
other sequence. TT is a preferred overhang sequence. The first and second iRNA agent
sequences can also be joined, e.g., by additional bases to form a hairpin, or by other non-base
linkers.
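The preferred architecture (21-nucleotide strands, a duplex region of 19 pairs, 2-nucleotide TT overhangs) can be illustrated with a short Python sketch. The helper names and the 19-mer target are hypothetical, and the sketch assumes a fully complementary duplex with a 3' overhang on each strand.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def build_duplex(target_19mer, overhang="TT"):
    """Return (sense, antisense), both written 5'->3'.

    The sense strand copies the target site; the anti-sense strand is its
    reverse complement. Each strand receives the same 3' overhang.
    """
    sense = target_19mer + overhang
    antisense = reverse_complement(target_19mer) + overhang
    return sense, antisense


if __name__ == "__main__":
    target = "GCAUGCAUGCAUGCAUGCA"  # hypothetical 19-nt target site
    sense, antisense = build_duplex(target)
    print(len(sense), sense)
    print(len(antisense), antisense)
```

Each strand comes out at 21 nucleotides, of which 19 take part in the duplex and 2 form the overhang.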

One or more selection or constraint parameters can be exercised such that: monomers
at the selected site in the sense and anti-sense sequences are both naturally occurring
ribonucleotides, or modified ribonucleotides having naturally occurring bases, and when
occupying complementary sites in the iRNA agent duplex either do not pair and have no
substantial level of H-bonding, or form a non-canonical Watson-Crick pairing and thus form
a non-canonical pattern of H bonding, which generally has a lower free energy of
dissociation than seen in a Watson-Crick pairing, or otherwise pair to give a free energy of
association which is less than that of a preselected value or is less, e.g., than that of a
canonical pairing. When one, usually the anti-sense sequence, of the iRNA agent sequences
forms a duplex with another sequence, generally a sequence in the subject, and generally a
target sequence, the monomer forms a classic Watson-Crick pairing with the base in the
complementary position on the target, or forms a non-canonical Watson-Crick pairing having
a higher free energy of dissociation and a higher Tm than seen in the pairing in the iRNA
agent. Optionally, when the other sequence of the iRNA agent, usually the sense sequence,
forms a duplex with another sequence, generally a sequence in the subject, and generally an
off-target sequence, the monomer fails to form a canonical Watson-Crick pairing with the
base in the complementary position on the off-target sequence, e.g., it forms a non-
canonical Watson-Crick pairing having a lower free energy of dissociation and a lower Tm.
By way of example:

the monomer at the selected site in the anti-sense strand includes an A (or a modified
base which pairs with T), the corresponding monomer in the target is a T, and the sense
strand is chosen from a base which will not pair or which will form a non-canonical pair, e.g.,
G;

the monomer at the selected site in the anti-sense strand includes a U (or a modified
base which pairs with A), the corresponding monomer in the target is an A, and the sense
strand is chosen from a monomer which will not pair or which will form a non-canonical
pairing, e.g., U or G;

the monomer at the selected site in the anti-sense strand includes a C (or a modified
base which pairs with G), the corresponding monomer in the target is a G, and the sense
strand is chosen from a monomer which will not pair or which will form a non-canonical pairing,
e.g., G, Acis, Atrans, or U; or

the monomer at the selected site in the anti-sense strand includes a G (or a modified
base which pairs with C), the corresponding monomer in the target is a C, and the sense
strand is chosen from a monomer which will not pair or which will form a non-canonical
pairing.
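The four examples above amount to a small lookup table, sketched here in Python. The table and function names are hypothetical. The entries copy the examples as written: A stands for both the Acis and Atrans options, and no sense-strand candidates are listed for an anti-sense G because the text names none.

```python
# Watson-Crick partner expected in the target for each anti-sense base,
# paired exactly as in the examples above (A with T, U with A, C with G, G with C).
TARGET_PARTNER = {"A": "T", "U": "A", "C": "G", "G": "C"}

# Sense-strand choices quoted in the examples; empty where the text gives none.
SENSE_CHOICES = {
    "A": ("G",),
    "U": ("U", "G"),
    "C": ("G", "A", "U"),
    "G": (),
}


def constrained_site(antisense_base):
    """Summarize the example that applies to a given anti-sense base."""
    base = antisense_base.upper()
    if base not in TARGET_PARTNER:
        raise ValueError("expected one of A, U, C, G")
    return {
        "antisense": base,
        "target": TARGET_PARTNER[base],
        "sense_options": SENSE_CHOICES[base],
    }


if __name__ == "__main__":
    for base in "AUCG":
        print(constrained_site(base))
```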

In another embodiment a non-naturally occurring or modified monomer or monomers
are chosen such that when they occupy complementary positions in an iRNA agent they exhibit
a first free energy of dissociation and when one (or both) of them pairs with a naturally
occurring monomer, the pair exhibits a second free energy of dissociation, which is usually
higher than that of the pairing of the first and second monomers. E.g., when the first and
second monomers occupy complementary positions they either do not pair and have no
substantial level of H-bonding, or form a weaker bond than one of them would form with a
naturally occurring monomer, and reduce the stability of that duplex, but when the duplex
dissociates at least one of the strands will form a duplex with a target in which the selected
monomer will promote stability, e.g., the monomer will form a more stable pair with a
naturally occurring monomer in the target sequence than the pairing it formed in the iRNA
agent.

An example of such a pairing is 2-amino A and either of a 2-thio pyrimidine analog
of U or T. As is discussed above, when placed in complementary positions of the iRNA
agent these monomers will pair very poorly and will minimize stability. However, a duplex
formed between 2-amino A and the U of a naturally occurring target, or between 2-thio U
and the A of a naturally occurring target, or between 2-thio T and the A of a
naturally occurring target, will have a relatively higher free energy of dissociation and be
more stable.

The monomer at the selected position in the sense strand can be a universal pairing
moiety. A universal pairing agent will form some level of H bonding with more than one and
preferably all other naturally occurring monomers. An example of a universal pairing
moiety is a monomer which includes 3-nitro pyrrole. Examples of other candidate universal
base analogs can be found in the art, e.g., in Loakes, 2001, NAR 29: 2437-2447, hereby
incorporated by reference. In these cases the monomer at the corresponding position of the
anti-sense strand can be chosen for its ability to form a duplex with the target and can
include, e.g., A, U, G, or C.

In another aspect, the invention features an iRNA agent which includes:
a sense sequence, which preferably does not target a sequence in a subject, and an anti-sense
sequence, which targets a plurality of target sequences in a subject, wherein the targets differ
in sequence at only 1 or a small number, e.g., no more than 5, 4, 3 or 2 positions. The sense
and anti-sense sequences have sufficient complementarity to each other to hybridize, e.g.,
under physiological conditions, e.g., under physiological conditions but not in contact with a
helicase or other unwinding enzyme. The sequence of the anti-sense strand of the iRNA
agent is selected such that at one, some, or all of the positions which correspond to positions
that differ in sequence between the target sequences, the anti-sense strand will include a
monomer which will form H-bonds with at least two different target sequences. In a
preferred example the anti-sense sequence will include a universal or promiscuous monomer,
e.g., a monomer which includes 5-nitro pyrrole, 2-amino A, 2-thio U or 2-thio T, or other
universal base referred to herein.

In a preferred embodiment the iRNA agent targets repeated sequences (which differ 
at only one or a small number of positions from each other) in a single gene, a plurality of 
genes, or a viral genome, e.g., the HCV genome. 
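One way to picture the multi-target design is to align the target variants, mark each position where they differ, and place a universal monomer at the corresponding anti-sense position. The Python sketch below does this for equal-length RNA strings. The function names, the wildcard symbol `N`, and the two five-base targets are assumptions made for illustration.

```python
def consensus_with_wildcards(targets, wildcard="N"):
    """Collapse equal-length target sequences into one string, writing
    `wildcard` at every position where the targets differ."""
    if len({len(t) for t in targets}) != 1:
        raise ValueError("all targets must have the same length")
    return "".join(
        column[0] if len(set(column)) == 1 else wildcard
        for column in zip(*targets)
    )


def antisense_for(targets, wildcard="N"):
    """Return an anti-sense sequence (5'->3') complementary to all targets,
    with `wildcard` standing for a universal or promiscuous monomer."""
    complement = {"A": "U", "U": "A", "G": "C", "C": "G", wildcard: wildcard}
    consensus = consensus_with_wildcards(targets, wildcard)
    return "".join(complement[base] for base in reversed(consensus))


if __name__ == "__main__":
    variants = ["AUGCA", "AUGGA"]  # hypothetical targets differing at one position
    print(consensus_with_wildcards(variants))
    print(antisense_for(variants))
```

A single wildcard symbol is used here for simplicity; in practice the choice among the universal or promiscuous monomers named above would depend on which bases occur at the variable position.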

An embodiment is illustrated in FIGs. 2 and 3.

In another aspect, the invention features determining, e.g., by measurement or
calculation, the stability of a pairing between monomers at a selected or constrained position
in the iRNA agent duplex, and preferably determining the stability for the corresponding
pairing in a duplex between a sequence from the iRNA agent and another RNA, e.g., a target
sequence. The determinations can be compared. An iRNA agent thus analyzed can be used
in the development of a further modified iRNA agent or can be administered to a subject.
This analysis can be performed successively to refine or design optimized iRNA agents.
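The comparison described above can be sketched, in a deliberately crude form, as a check of whether the selected site pairs canonically in each duplex. The Python below is an assumption-laden illustration: it replaces measured or calculated stability with a yes/no Watson-Crick test, and the function names and five-base sequences are hypothetical.

```python
# Canonical Watson-Crick RNA pairs, written as (base, partner).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pairs_canonically(base, partner):
    return (base, partner) in WATSON_CRICK


def site_report(sense, antisense, target, site):
    """Compare the pairing at one anti-sense position in two duplexes.

    All sequences are written 5'->3' and have equal length. `site` is a
    0-based index into the anti-sense strand; because the strands are
    antiparallel, its partner sits at index len - 1 - site on the other strand.
    """
    partner_index = len(antisense) - 1 - site
    agent_ok = pairs_canonically(antisense[site], sense[partner_index])
    target_ok = pairs_canonically(antisense[site], target[partner_index])
    return {
        "agent_canonical": agent_ok,
        "target_canonical": target_ok,
        # Design goal from the text: weak pairing in the agent, canonical with the target.
        "meets_goal": (not agent_ok) and target_ok,
    }


if __name__ == "__main__":
    target = "GCAUG"     # hypothetical target site
    antisense = "CAUGC"  # reverse complement of the target
    sense = "GCAGG"      # target copy with U replaced by G opposite anti-sense A
    print(site_report(sense, antisense, target, site=1))
```

Replacing the yes/no test with measured melting temperatures or calculated free energies would bring the sketch closer to the determination the text describes.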

In another aspect, the invention features a kit which includes one or more of the
following: an iRNA agent described herein, a sterile container in which the iRNA agent is
disclosed, and instructions for use.

In another aspect, the invention features an iRNA agent containing a constrained
sequence made by a method described herein. The iRNA agent can target one or more of the
genes referred to herein.

iRNA agents having constrained or selected sites, e.g., as described herein, can be
used in any way described herein. Accordingly, iRNA agents having constrained or
selected sites, e.g., as described herein, can be used to silence a target, e.g., in any of the
methods described herein and to target any of the genes described herein or to treat any of the
disorders described herein. iRNA agents having constrained or selected sites, e.g., as
described herein, can be incorporated into any of the formulations or preparations, e.g.,
pharmaceutical or sterile preparations described herein. iRNA agents having constrained or
selected sites, e.g., as described herein, can be administered by any of the routes of
administration described herein.

The term "other than canonical Watson-Crick pairing" as used herein, refers to a
pairing between a first monomer in a first sequence and a second monomer at the
corresponding position in a second sequence of a duplex in which one or more of the
following is true: (1) there is essentially no pairing between the two, e.g., there is no
significant level of H bonding between the monomers or binding between the monomers
does not contribute in any significant way to the stability of the duplex; (2) the monomers are
a non-canonical pairing of monomers having naturally occurring bases, i.e., they are other
than A-T, A-U, or G-C, and they form monomer-monomer H bonds, although generally the
H bonding pattern formed is less strong than the bonds formed by a canonical pairing; or (3)
at least one of the monomers includes a non-naturally occurring base and the H bonds
formed between the monomers are, preferably, less strong than the bonds formed by
a canonical pairing, namely one or more of A-T, A-U, G-C.
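The three-part definition above can be restated as a small decision procedure. The Python sketch below is illustrative only: the function name and the `h_bonded` flag are assumptions, and whether two monomers actually form significant H bonds is supplied by the caller rather than computed.

```python
NATURAL_BASES = set("AUGCT")
CANONICAL_PAIRS = {frozenset("AT"), frozenset("AU"), frozenset("GC")}


def classify_pairing(first, second, h_bonded=True):
    """Return 'canonical', or the numbered case (1, 2, or 3) of the
    definition of 'other than canonical Watson-Crick pairing'.

    `h_bonded` states whether the two monomers form a significant level of
    H bonding; it is an input, not something this sketch can determine.
    """
    both_natural = first in NATURAL_BASES and second in NATURAL_BASES
    if both_natural and frozenset((first, second)) in CANONICAL_PAIRS:
        return "canonical"
    if not h_bonded:
        return 1  # essentially no pairing
    if both_natural:
        return 2  # non-canonical pairing of naturally occurring bases
    return 3      # at least one non-naturally occurring base


if __name__ == "__main__":
    print(classify_pairing("A", "U"))
    print(classify_pairing("G", "U"))
    print(classify_pairing("U", "U", h_bonded=False))
    print(classify_pairing("X", "A"))  # "X" is a placeholder for a non-natural base
```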

The term "off-target" as used herein, refers to a sequence other than the sequence to
be silenced.

Universal Bases: "wild-cards"; shape-based complementarity

(Bi-stranded, multisite replication of a base pair between difluorotoluene and adenine: confirmation by
'inverse' sequencing. Liu, D.; Moran, S.; Kool, E. T. Chem. Biol., 1997, 4, 919-926)

(Importance of terminal base pair hydrogen-bonding in 3'-end proofreading by the Klenow fragment
of DNA polymerase I. Morales, J. C.; Kool, E. T. Biochemistry, 2000, 39, 2626-2632)

(Selective and stable DNA base pairing without hydrogen bonds. Matray, T. J.; Kool, E. T. J. Am.
Chem. Soc., 1998, 120, 6191-6192)

(Difluorotoluene, a nonpolar isostere for thymine, codes specifically and efficiently for adenine in
DNA replication. Moran, S.; Ren, R. X.-F.; Rumney IV, S.; Kool, E. T. J. Am. Chem. Soc., 1997, 119, 2056-
2057)

(Structure and base pairing properties of a replicable nonpolar isostere for deoxyadenosine. Guckian,
K. M.; Morales, J. C.; Kool, E. T. J. Org. Chem., 1998, 63, 9652-9656)

(Universal bases for hybridization, replication and chain termination. Berger, M.; Wu, Y.; Ogawa,
A. K.; McMinn, D. L.; Schultz, P. G.; Romesberg, F. E. Nucleic Acids Res., 2000, 28, 2911-2914)

(1. Efforts toward the expansion of the genetic alphabet: Information storage and replication with unnatural
hydrophobic base pairs. Ogawa, A. K.; Wu, Y.; McMinn, D. L.; Liu, J.; Schultz, P. G.; Romesberg, F. E. J.
Am. Chem. Soc., 2000, 122, 3274-3287. 2. Rational design of an unnatural base pair with increased kinetic
selectivity. Ogawa, A. K.; Wu, Y.; Berger, M.; Schultz, P. G.; Romesberg, F. E. J. Am. Chem. Soc., 2000,
122, 8803-8804)

(Efforts toward expansion of the genetic alphabet: replication of DNA with three base pairs. Tae, E. L.;
Wu, Y.; Xia, G.; Schultz, P. G.; Romesberg, F. E. J. Am. Chem. Soc., 2001, 123, 7439-7440)

(1. Efforts toward expansion of the genetic alphabet: Optimization of interbase hydrophobic
interactions. Wu, Y.; Ogawa, A. K.; Berger, M.; McMinn, D. L.; Schultz, P. G.; Romesberg, F. E. J. Am. Chem.
Soc., 2000, 122, 7621-7632. 2. Efforts toward expansion of genetic alphabet: DNA polymerase recognition of a
highly stable, self-pairing hydrophobic base. McMinn, D. L.; Ogawa, A. K.; Wu, Y.; Liu, J.; Schultz, P. G.;
Romesberg, F. E. J. Am. Chem. Soc., 1999, 121, 11585-11586)

(A stable DNA duplex containing a non-hydrogen-bonding and non-shape complementary base
couple: Interstrand stacking as the stability determining factor. Brotschi, C.; Haberli, A.; Leumann, C. J. Angew.
Chem. Int. Ed., 2001, 40, 3012-3014)

(2,2'-Bipyridine Ligandoside: A novel building block for modifying DNA with intra-duplex metal
complexes. Weizman, H.; Tor, Y. J. Am. Chem. Soc., 2001, 123, 3375-3376)

(Minor groove hydration is critical to the stability of DNA duplexes. Lan, T.; McLaughlin, L. W. J.
Am. Chem. Soc., 2000, 122, 6512-13)

(1. Effect of the universal base 3-nitropyrrole on the selectivity of neighboring natural bases. Oliver, J.
S.; Parker, K. A.; Suggs, J. W. Organic Lett., 2001, 3, 1977-1980. 2. Effect of the 1-(2'-deoxy-β-D-
ribofuranosyl)-3-nitropyrrole residue on the stability of DNA duplexes and triplexes. Amosova, O.; George, J.;
Fresco, J. R. Nucleic Acids Res., 1997, 25, 1930-1934. 3. Synthesis, structure and deoxyribonucleic acid
sequencing with a universal nucleoside: 1-(2'-deoxy-β-D-ribofuranosyl)-3-nitropyrrole. Bergstrom, D. E.;
Zhang, P.; Toma, P. H.; Andrews, P. C.; Nichols, R. J. Am. Chem. Soc., 1995, 117, 1201-1209)

(Model studies directed toward a general triplex DNA recognition scheme: a novel DNA base that
binds a CG base-pair in an organic solvent. Zimmerman, S. C.; Schmitt, P. J. Am. Chem. Soc., 1995, 117,
10769-10770)

(A universal, photocleavable DNA base: nitropiperonyl 2'-deoxyriboside. J. Org. Chem., 2001, 66,
2067-2071)

(a. Recognition of a single guanine bulge by 2-acylamino-1,8-naphthyridine. Nakatani, K.; Sando, S.;
Saito, I. J. Am. Chem. Soc., 2000, 122, 2172-2177. b. Specific binding of 2-amino-1,8-naphthyridine into single
guanine bulge as evidenced by photooxidation of GC doublet. Nakatani, K.; Sando, S.; Yoshida, K.; Saito, I.
Bioorg. Med. Chem. Lett., 2001, 11, 335-337)

Other universal bases can have the following formulas:






wherein:
Q is N or CR44;
Q' is N or CR45;
Q'' is N or CR47;
Q''' is N or CR49;
Qiv is N or CR50;



37 



.45 



46 



WO 2004/080406 PCT/US2004/007070 

R44 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, C1-
C6 alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R45
forms -OCH2O-;

R45 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, C1-
C6 alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R44
or R46 forms -OCH2O-;

R46 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, C1-
C6 alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R45
or R47 forms -OCH2O-;

R47 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, C1-
C6 alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R46
or R48 forms -OCH2O-;

R48 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, C1-
C6 alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R47
forms -OCH2O-;

R49, R50, R51, R52, R53, R54, R57, R58, R59, R60, R61, R62, R63, R64, R65, R66, R67, R68, R69,
R70, R71, and R72 are each independently selected from hydrogen, halo, hydroxy, nitro,
protected hydroxy, NH2, NHRb, or NRbRc, C1-C6 alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10
heteroaryl, C3-C8 heterocyclyl, NC(O)R17, or NC(O)Ro;

R55 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, C1-
C6 alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, NC(O)R17, or
NC(O)Ro, or when taken together with R56 forms a fused aromatic ring which may be
optionally substituted;

R56 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, C1-
C6 alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, NC(O)R17, or
NC(O)Ro, or when taken together with R55 forms a fused aromatic ring which may be
optionally substituted;

R17 is halo, NH2, NHRb, or NRbRc;

Rb is C1-C6 alkyl or a nitrogen protecting group;

Rc is C1-C6 alkyl; and

Ro is alkyl optionally substituted with halo, hydroxy, nitro, protected hydroxy, NH2,
NHRb, or NRbRc, C1-C6 alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8
heterocyclyl, NC(O)R17, or NC(O)Ro.

Examples of universal bases include:

In one aspect, the invention features methods of producing iRNA agents, e.g., sRNA
agents, e.g., an sRNA agent described herein, having the ability to mediate RNAi. These
iRNA agents can be formulated for administration to a subject.

In another aspect, the invention features a method of administering an iRNA agent,
e.g., a double-stranded iRNA agent, or sRNA agent, to a subject (e.g., a human subject). The
method includes administering a unit dose of the iRNA agent, e.g., a sRNA agent, e.g., a
double stranded sRNA agent in which (a) the double-stranded part is 19-25 nucleotides (nt) long,
preferably 21-23 nt, and which (b) is complementary to a target RNA (e.g., an endogenous or pathogen
target RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long. In
one embodiment, the unit dose is less than 1.4 mg per kg of bodyweight, or less than 10, 5, 2,
1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, 0.00005 or 0.00001 mg per kg of
bodyweight, and less than 200 nmole of RNA agent (e.g., about 4.4 x 10^16 copies) per kg of
bodyweight, or less than 1500, 750, 300, 150, 75, 15, 7.5, 1.5, 0.75, 0.15, 0.075, 0.015,
0.0075, 0.0015, 0.00075, 0.00015 nmole of RNA agent per kg of bodyweight.

The defined amount can be an amount effective to treat or prevent a disease or
disorder, e.g., a disease or disorder associated with the target RNA. The unit dose, for
example, can be administered by injection (e.g., intravenous or intramuscular), an inhaled
dose, or a topical application. Particularly preferred dosages are less than 2, 1, or 0.1 mg/kg
of body weight.

In a preferred embodiment, the unit dose is administered less frequently than once a
day, e.g., less than every 2, 4, 8 or 30 days. In another embodiment, the unit dose is not
administered with a frequency (e.g., not a regular frequency). For example, the unit dose
may be administered a single time.

In one embodiment, the effective dose is administered with other traditional
therapeutic modalities. In one embodiment, the subject has a viral infection and the modality
is an antiviral agent other than an iRNA agent, e.g., other than a double-stranded iRNA
agent, or sRNA agent. In another embodiment, the subject has atherosclerosis and the
effective dose of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, is
administered in combination with, e.g., after surgical intervention, e.g., angioplasty.

In one embodiment, a subject is administered an initial dose and one or more
maintenance doses of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof). The maintenance dose or doses are generally lower than the initial dose,
e.g., one-half less of the initial dose. A maintenance regimen can include treating the subject
with a dose or doses ranging from 0.01 µg to 1.4 mg/kg of body weight per day, e.g., 10, 1,
0.1, 0.01, 0.001, or 0.00001 mg per kg of bodyweight per day. The maintenance doses are
preferably administered no more than once every 5, 10, or 30 days.
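The relationship between an initial dose and a maintenance dose can be shown as simple arithmetic. The Python sketch below only illustrates the one-half example from the text. The function name, the 1.0 mg/kg initial dose, the 70 kg body weight, and the 5-day interval are hypothetical inputs chosen for the example, not recommended values.

```python
def maintenance_plan(initial_mg_per_kg, body_weight_kg, interval_days=5, fraction=0.5):
    """Scale an initial per-kg dose to a maintenance dose.

    `fraction` defaults to one-half, matching the example in the text;
    `interval_days` is the spacing between maintenance doses.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    per_kg = initial_mg_per_kg * fraction
    return {
        "maintenance_mg_per_kg": per_kg,
        "maintenance_mg_total": per_kg * body_weight_kg,
        "interval_days": interval_days,
    }


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    print(maintenance_plan(initial_mg_per_kg=1.0, body_weight_kg=70))
```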

In one embodiment, the iRNA agent pharmaceutical composition includes a plurality 
of iRNA agent species. In another embodiment, the iRNA agent species has sequences that 
are non-overlapping and non-adjacent to another species with respect to a naturally occurring 
target sequence. In another embodiment, the plurality of iRNA agent species is specific for 
different naturally occurring target genes. In another embodiment, the iRNA agent is allele 
specific. 

The inventors have discovered that iRNA agents described herein can be administered
to mammals, particularly large mammals such as nonhuman primates or humans, in a number
of ways.

In one embodiment, the administration of the iRNA agent, e.g., a double-stranded
iRNA agent, or sRNA agent, composition is parenteral, e.g., intravenous (e.g., as a bolus or as
a diffusible infusion), intradermal, intraperitoneal, intramuscular, intrathecal, intraventricular,
intracranial, subcutaneous, transmucosal, buccal, sublingual, endoscopic, rectal, oral, vaginal,
topical, pulmonary, intranasal, urethral or ocular. Administration can be provided by the
subject or by another person, e.g., a health care provider. The medication can be provided in
measured doses or in a dispenser that delivers a metered dose. Selected modes of delivery
are discussed in more detail below.

The invention provides methods, compositions, and kits for rectal administration or
delivery of iRNA agents described herein.

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
or precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA
agent described herein, e.g., an iRNA agent having a double stranded region of less than 40,
and preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3'
overhangs can be administered rectally, e.g., introduced through the rectum into the lower or
upper colon. This approach is particularly useful in the treatment of inflammatory disorders,
disorders characterized by unwanted cell proliferation, e.g., polyps, or colon cancer.

In some embodiments the medication is delivered to a site in the colon by introducing
a dispensing device, e.g., a flexible, camera-guided device similar to that used for inspection
of the colon or removal of polyps, which includes means for delivery of the medication.

In one embodiment, the rectal administration of the iRNA agent is by means of an 
enema. The iRNA agent of the enema can be dissolved in a saline or buffered solution. 

In another embodiment, the rectal administration is by means of a suppository. The
suppository can include other ingredients, e.g., an excipient, e.g., cocoa butter or
hydroxypropylmethylcellulose.

The invention also provides methods, compositions, and kits for oral delivery of
iRNA agents described herein.

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent
described herein, e.g., an iRNA agent having a double stranded region of less than 40 and
preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3'
overhangs can be administered orally.

Oral administration can be in the form of tablets, capsules, gel capsules, lozenges,
troches or liquid syrups. In a preferred embodiment the composition is applied topically to a
surface of the oral cavity.

The invention also provides methods, compositions, and kits for buccal delivery of
iRNA agents described herein.

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) described herein, e.g., a therapeutically effective amount of iRNA agent
having a double stranded region of less than 40 and preferably less than 30 nucleotides and
having one or two 1-3 nucleotide single strand 3' overhangs can be administered to the buccal
cavity. The medication can be sprayed into the buccal cavity or applied directly, e.g., in a
liquid, solid, or gel form to a surface in the buccal cavity. This administration is particularly
desirable for the treatment of inflammations of the buccal cavity, e.g., the gums or tongue.
In one embodiment, the buccal administration is by spraying into the cavity, e.g.,
without inhalation, from a dispenser, e.g., a metered dose spray dispenser that dispenses the
pharmaceutical composition and a propellant.

The invention also provides methods, compositions, and kits for ocular delivery of
iRNA agents described herein.

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent
described herein, e.g., a sRNA agent having a double stranded region of less than 40 and
preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3'
overhangs can be administered to ocular tissue.

The medication can be applied to the surface of the eye or nearby tissue, e.g., the
inside of the eyelid. It can be applied topically, e.g., by spraying, in drops, as an eyewash, or
as an ointment. Administration can be provided by the subject or by another person, e.g., a
health care provider. The medication can be provided in measured doses or in a dispenser
that delivers a metered dose.

The medication can also be administered to the interior of the eye, and can be 
introduced by a needle or other delivery device which can introduce it to a selected area or 
structure. 

Ocular treatment is particularly desirable for treating inflammation of the eye or 
nearby tissue. 

The invention also provides methods, compositions, and kits for delivery of iRNA 
agents described herein to or through the skin. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent
described herein, e.g., a sRNA agent having a double stranded region of less than 40 and
preferably less than 30 nucleotides and one or two 1-3 nucleotide single strand 3' overhangs
can be administered directly to the skin.

The medication can be applied topically or delivered in a layer of the skin, e.g., by the 
use of a microneedle or a battery of microneedles which penetrate into the skin, but 
preferably not into the underlying muscle tissue. 

In one embodiment, the administration of the iRNA agent composition is topical. In
another embodiment, topical administration delivers the composition to the dermis or
epidermis of a subject. In other embodiments the topical administration is in the form of
transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids or
powders. A composition for topical administration can be formulated as a liposome, micelle,
emulsion, or other lipophilic molecular assembly.

In another embodiment, the transdermal administration is applied with at least one
penetration enhancer. In other embodiments, the penetration can be enhanced with
iontophoresis, phonophoresis, and sonophoresis.

In another aspect, the invention provides methods, compositions, devices, and kits for
pulmonary delivery of iRNA agents described herein.

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) described herein, e.g., a therapeutically effective amount of iRNA agent,
e.g., a sRNA agent having a double stranded region of less than 40, preferably less than 30
nucleotides and having one or two 1-3 nucleotide single strand 3' overhangs can be
administered to the pulmonary system. Pulmonary administration can be achieved by
inhalation or by the introduction of a delivery device into the pulmonary system, e.g., by
introducing a delivery device which can dispense the medication.

The preferred method of pulmonary delivery is by inhalation. The medication can be
provided in a dispenser which delivers the medication, e.g., wet or dry, in a form sufficiently
small such that it can be inhaled. The device can deliver a metered dose of medication. The
subject, or another person, can administer the medication.

Pulmonary delivery is effective not only for disorders which directly affect
pulmonary tissue, but also for disorders which affect other tissue.

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder, crystal, or
aerosol for pulmonary delivery.

In another aspect, the invention provides methods, compositions, devices, and kits for
nasal delivery of iRNA agents described herein. Accordingly, an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, or precursor thereof) described herein, e.g., a
therapeutically effective amount of iRNA agent, e.g., a sRNA agent having a double stranded
region of less than 40 and preferably less than 30 nucleotides and having one or two 1-3
nucleotide single strand 3' overhangs can be administered nasally. Nasal administration can
be achieved by introduction of a delivery device into the nose, e.g., by introducing a delivery
device which can dispense the medication.

The preferred method of nasal delivery is by spray, aerosol, liquid, e.g., by drops, or
by topical administration to a surface of the nasal cavity. The medication can be provided in
a dispenser which delivers the medication, e.g., wet or dry, in a form sufficiently small
such that it can be inhaled. The device can deliver a metered dose of medication. The 
subject, or another person, can administer the medication. 

Nasal delivery is effective not only for disorders which directly affect nasal tissue, but
also for disorders which affect other tissue.

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder or crystal, for
nasal delivery.

In another embodiment, the iRNA agent is packaged in a viral natural capsid or in a
chemically or enzymatically produced artificial capsid or structure derived therefrom. 

In one aspect of the invention, the dosage of a pharmaceutical composition including
an iRNA agent is administered in order to alleviate the symptoms of a disease state, e.g.,
cancer or a cardiovascular disease. 

In another aspect, gene expression in a subject is modulated by administering a 
pharmaceutical composition including an iRNA agent. In other embodiments, a subject is
treated with the pharmaceutical composition by any of the methods mentioned above. In
another embodiment, the subject has cancer. 

An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) composition can be administered as a liposome. For example, the
composition can be prepared by a method that includes: (1) contacting an iRNA agent with an
amphipathic cationic lipid conjugate in the presence of a detergent; and (2) removing the
detergent to form an iRNA agent and cationic lipid complex. In one embodiment, the
detergent is cholate, deoxycholate, lauryl sarcosine, octanoyl sucrose, CHAPS (3-[(3-
cholamidopropyl)-di-methylamine]-2-hydroxyl-1-propane), nonyl-β-D-glucopyranoside,
lauryl dimethylamine oxide, or octylglucoside. The iRNA agent can be an sRNA agent. The
method can include preparing a composition that includes a plurality of iRNA agents, e.g.,
specific for one or more different endogenous target RNAs. The method can include other
features described herein.

In another aspect, a subject is treated by administering a defined amount of an iRNA 
agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger
iRNA agent which can be processed into a sRNA agent) composition that is in a powdered
form. In one embodiment, the powder is a collection of microparticles. In one embodiment,
the powder is a collection of crystalline particles. The composition can include a plurality of 
iRNA agents, e.g., specific for one or more different endogenous target RNAs. The method 
can include other features described herein. 

In one aspect, a subject is treated by administering a defined amount of an iRNA agent
composition that is prepared by a method that includes spray-drying, i.e., atomizing a liquid
solution, emulsion, or suspension, immediately exposing the droplets to a drying gas, and
collecting the resulting porous powder particles. The composition can include a plurality of
iRNA agents, e.g., specific for one or more different endogenous target RNAs. The method 
can include other features described herein. 

In one aspect, the iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof), is provided in a powdered, crystallized or other finely divided form, with
or without a carrier, e.g., a micro- or nano-particle suitable for inhalation or other pulmonary 
delivery. In one embodiment, this includes providing an aerosol preparation, e.g., an 
aerosolized spray-dried composition. The aerosol composition can be provided in and/or 
dispensed by a metered dose delivery device. 

In another aspect a subject is treated for a condition treatable by inhalation. In one 
embodiment, this method includes aerosolizing a spray-dried iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, or precursor thereof) composition and inhaling the
aerosolized composition. The iRNA agent can be an sRNA. The composition can include a
plurality of iRNA agents, e.g., specific for one or more different endogenous target RNAs.
The method can include other features described herein.

In another aspect, the invention features a method of treating a subject that includes: 
administering a composition including an effective/defined amount of an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, or precursor thereof), wherein the composition
is prepared by a method that includes spray-drying, lyophilization, vacuum drying,
evaporation, fluid bed drying, or a combination of these techniques.

In another aspect, the invention features a method that includes: evaluating a 
parameter related to the abundance of a transcript in a cell of a subject; comparing the 
evaluated parameter to a reference value; and if the evaluated parameter has a preselected 
relationship to the reference value (e.g., it is greater), administering an iRNA agent (or a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent or precursor thereof) to the subject. In one embodiment, the
iRNA agent includes a sequence that is complementary to the evaluated transcript. For
example, the parameter can be a direct measure of transcript levels, a measure of a protein
level, or a disease or disorder symptom or characterization (e.g., rate of cell proliferation
and/or tumor mass, viral load).
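
The evaluate, compare, and administer steps of this method can be sketched as a simple decision rule. The following Python fragment is only an illustration under stated assumptions: the function name, the `relationship` argument, and the numeric values in the usage note are hypothetical, and the specification leaves the choice of parameter, reference value, and preselected relationship open.

```python
import operator

# Hypothetical names for the "preselected relationship" between the
# evaluated parameter and the reference value.
RELATIONSHIPS = {
    "greater": operator.gt,
    "less": operator.lt,
}


def should_administer(evaluated, reference, relationship="greater"):
    """Return True when the evaluated parameter (e.g., a transcript level)
    has the preselected relationship to the reference value."""
    compare = RELATIONSHIPS[relationship]
    return compare(evaluated, reference)
```

With the default relationship, `should_administer(1200.0, 1000.0)` is true and `should_administer(800.0, 1000.0)` is false; the units and thresholds here are placeholders only.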

In another aspect, the invention features a method that includes: administering a first
amount of a composition that comprises an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a 
sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, or precursor thereof) to a subject, wherein the iRNA agent includes a strand 
substantially complementary to a target nucleic acid; evaluating an activity associated with a 
protein encoded by the target nucleic acid; wherein the evaluation is used to determine if a 
second amount should be administered. In a preferred embodiment the method includes 
administering a second amount of the composition, wherein the timing of administration or 
dosage of the second amount is a function of the evaluating. The method can include other
features described herein.

In another aspect, the invention features a method of administering a source of a 
double-stranded iRNA agent (ds iRNA agent) to a subject. The method includes 
administering or implanting a source of a ds iRNA agent, e.g., a sRNA agent, that (a) 
includes a double-stranded region that is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to a target RNA (e.g., an endogenous RNA or a pathogen 
RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nt long. In one embodiment, 
the source releases ds iRNA agent over time, e.g., the source is a controlled or a slow release
source, e.g., a microparticle that gradually releases the ds iRNA agent. In another
embodiment, the source is a pump, e.g., a pump that includes a sensor or a pump that can
release one or more unit doses.
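
The length criteria (a) and (c) recited above can be checked mechanically. The sketch below is illustrative only: the function name and the dictionary keys are hypothetical, criterion (b) (complementarity to the target RNA) is not evaluated, and lengths are assumed to be supplied in nucleotides.

```python
def describe_duplex(duplex_length, overhangs_3prime=()):
    """Report which of the recited length criteria a ds iRNA agent design meets.

    duplex_length    -- length of the double-stranded region, in nt
    overhangs_3prime -- lengths of any 3' overhangs, in nt (zero, one, or two)
    """
    return {
        # Criterion (a): double-stranded region 19-25 nt long.
        "duplex_19_25_nt": 19 <= duplex_length <= 25,
        # Preferred range for criterion (a): 21-23 nt.
        "duplex_21_23_nt": 21 <= duplex_length <= 23,
        # Optional criterion (c): at least one 3' overhang 1-5 nt long.
        "has_3prime_overhang_1_5_nt": any(1 <= n <= 5 for n in overhangs_3prime),
    }
```

For example, a 21 nt duplex with two 2 nt 3' overhangs, `describe_duplex(21, (2, 2))`, satisfies all three entries, whereas a 27 nt blunt duplex satisfies none.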

In one aspect, the invention features a pharmaceutical composition that includes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof)
including a nucleotide sequence complementary to a target RNA, e.g., substantially and/or
exactly complementary. The target RNA can be a transcript of an endogenous human gene.
In one embodiment, the iRNA agent (a) is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to an endogenous target RNA, and, optionally, (c) includes 
at least one 3' overhang 1-5 nt long. In one embodiment, the pharmaceutical composition can 
be an emulsion, microemulsion, cream, jelly, or liposome. 

In one example the pharmaceutical composition includes an iRNA agent mixed with a
topical delivery agent. The topical delivery agent can be a plurality of microscopic vesicles. 
The microscopic vesicles can be liposomes. In a preferred embodiment the liposomes are 
cationic liposomes. 

In another aspect, the pharmaceutical composition includes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent (e.g., a precursor, e.g., a larger iRNA agent
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, or precursor thereof) admixed with a topical
penetration enhancer. In one embodiment, the topical penetration enhancer is a fatty acid.
The fatty acid can be arachidonic acid, oleic acid, lauric acid, caprylic acid, capric acid,
myristic acid, palmitic acid, stearic acid, linoleic acid, linolenic acid, dicaprate, tricaprate,
monoolein, dilaurin, glyceryl 1-monocaprate, 1-dodecylazacycloheptan-2-one, an
acylcarnitine, an acylcholine, or a C1-10 alkyl ester, monoglyceride, diglyceride or
pharmaceutically acceptable salt thereof. 

In another embodiment, the topical penetration enhancer is a bile salt. The bile salt 
can be cholic acid, dehydrocholic acid, deoxycholic acid, glucholic acid, glycholic acid,
glycodeoxycholic acid, taurocholic acid, taurodeoxycholic acid, chenodeoxycholic acid,
ursodeoxycholic acid, sodium tauro-24,25-dihydro-fusidate, sodium glycodihydrofusidate,
polyoxyethylene-9-lauryl ether or a pharmaceutically acceptable salt thereof.

In another embodiment, the penetration enhancer is a chelating agent. The chelating 
agent can be EDTA, citric acid, a salicylate, a N-acyl derivative of collagen, laureth-9, an
N-amino acyl derivative of a beta-diketone or a mixture thereof.

In another embodiment, the penetration enhancer is a surfactant, e.g., an ionic or 
nonionic surfactant. The surfactant can be sodium lauryl sulfate, polyoxyethylene-9-lauryl 
ether, polyoxyethylene-20-cetyl ether, a perfluorochemical emulsion or mixture thereof.

In another embodiment, the penetration enhancer can be selected from a group
consisting of unsaturated cyclic ureas, 1-alkyl-alkones, 1-alkenylazacyclo-alkanones,
steroidal anti-inflammatory agents and mixtures thereof. In yet another embodiment the
penetration enhancer can be a glycol, a pyrrol, an azone, or a terpene.

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
form suitable for oral delivery. In one embodiment, oral delivery can be used to deliver an 
iRNA agent composition to a cell or a region of the gastro-intestinal tract, e.g., small 
intestine, colon (e.g., to treat a colon cancer), and so forth. The oral delivery form can be 
tablets, capsules or gel capsules. In one embodiment, the iRNA agent of the pharmaceutical 
composition modulates expression of a cellular adhesion protein, modulates a rate of cellular 
proliferation, or has biological activity against eukaryotic pathogens or retroviruses. In
another embodiment, the pharmaceutical composition includes an enteric material that 
substantially prevents dissolution of the tablets, capsules or gel capsules in a mammalian 
stomach. In a preferred embodiment the enteric material is a coating. The coating can be 
acetate phthalate, propylene glycol, sorbitan monoleate, cellulose acetate trimellitate, 
hydroxy propyl methylcellulose phthalate or cellulose acetate phthalate. 

In another embodiment, the oral dosage form of the pharmaceutical composition
includes a penetration enhancer. The penetration enhancer can be a bile salt or a fatty acid. 
The bile salt can be ursodeoxycholic acid, chenodeoxycholic acid, and salts thereof. The 
fatty acid can be capric acid, lauric acid, and salts thereof. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes an excipient. In one example the excipient is polyethyleneglycol. In another
example the excipient is precirol.

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate,
dibutyl phthalate or triethyl citrate.

In one aspect, the invention features a pharmaceutical composition including an
iRNA agent and a delivery vehicle. In one embodiment, the iRNA agent (a) is 19-25
nucleotides long, preferably 21-23 nucleotides, (b) is complementary to an endogenous target 
RNA, and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long. 

In one embodiment, the delivery vehicle can deliver an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, or precursor thereof) to a cell by a topical route of
administration. The delivery vehicle can be microscopic vesicles. In one example the
microscopic vesicles are liposomes. In a preferred embodiment the liposomes are cationic 
liposomes. In another example the microscopic vesicles are micelles. 

In one aspect, the invention features a method for making a pharmaceutical 
composition, the method including: (1) contacting an iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be
processed into a sRNA agent) with an amphipathic cationic lipid conjugate in the presence of
a detergent; and (2) removing the detergent to form an iRNA agent and cationic lipid complex.

In another aspect, the invention features a pharmaceutical composition produced by a
method including: (1) contacting an iRNA agent, e.g., a double-stranded iRNA agent, or
sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a
sRNA agent) with an amphipathic cationic lipid conjugate in the presence of a detergent; and
(2) removing the detergent to form an iRNA agent and cationic lipid complex. In one
embodiment, the detergent is cholate, deoxycholate, lauryl sarcosine, octanoyl sucrose,
CHAPS (3-[(3-cholamidopropyl)-di-methylamine]-2-hydroxyl-1-propane), nonyl-β-D-
glucopyranoside, lauryl dimethylamine oxide, or octylglucoside. In another embodiment, the
amphipathic cationic lipid conjugate is biodegradable. In yet another embodiment the 
pharmaceutical composition includes a targeting ligand. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in an 
injectable dosage form. In one embodiment, the injectable dosage form of the 
pharmaceutical composition includes sterile aqueous solutions or dispersions and sterile 
powders. In a preferred embodiment the sterile solution can include a diluent such as water,
saline solution, fixed oils, polyethylene glycols, glycerin, or propylene glycol.

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in 
oral dosage form. In one embodiment, the oral dosage form is selected from the group 
consisting of tablets, capsules and gel capsules. In another embodiment, the pharmaceutical
composition includes an enteric material that substantially prevents dissolution of the tablets, 
capsules or gel capsules in a mammalian stomach. In a preferred embodiment the enteric 
material is a coating. The coating can be acetate phthalate, propylene glycol, sorbitan 
monoleate, cellulose acetate trimellitate, hydroxy propyl methyl cellulose phthalate or 
cellulose acetate phthalate. In one embodiment, the oral dosage form of the pharmaceutical 
composition includes a penetration enhancer, e.g., a penetration enhancer described herein. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes an excipient. In one example the excipient is polyethyleneglycol. In another
example the excipient is precirol. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate,
dibutyl phthalate or triethyl citrate.

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a
rectal dosage form. In one embodiment, the rectal dosage form is an enema. In another
embodiment, the rectal dosage form is a suppository.

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
vaginal dosage form. In one embodiment, the vaginal dosage form is a suppository. In 
another embodiment, the vaginal dosage form is a foam, cream, or gel. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
pulmonary or nasal dosage form. In one embodiment, the iRNA agent is incorporated into a
particle, e.g., a macroparticle, e.g., a microsphere. The particle can be produced by spray
drying, lyophilization, evaporation, fluid bed drying, vacuum drying, or a combination
thereof. The microsphere can be formulated as a suspension, a powder, or an implantable
solid.

In one aspect, the invention features a spray-dried iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, or precursor thereof) composition suitable for 
inhalation by a subject, including: (a) a therapeutically effective amount of an iRNA agent
suitable for treating a condition in the subject by inhalation; (b) a pharmaceutically
acceptable excipient selected from the group consisting of carbohydrates and amino acids; 
and (c) optionally, a dispersibility-enhancing amount of a physiologically-acceptable, water- 
soluble polypeptide. 

In one embodiment, the excipient is a carbohydrate. The carbohydrate can be 
selected from the group consisting of monosaccharides, disaccharides, trisaccharides, and 
polysaccharides. In a preferred embodiment the carbohydrate is a monosaccharide selected
from the group consisting of dextrose, galactose, mannitol, D-mannose, sorbitol, and sorbose. 
In another preferred embodiment the carbohydrate is a disaccharide selected from the group 
consisting of lactose, maltose, sucrose, and trehalose. 

In another embodiment, the excipient is an amino acid. In one embodiment, the 
amino acid is a hydrophobic amino acid. In a preferred embodiment the hydrophobic amino 
acid is selected from the group consisting of alanine, isoleucine, leucine, methionine, 
phenylalanine, proline, tryptophan, and valine. In yet another embodiment the amino acid is a 
polar amino acid. In a preferred embodiment the amino acid is selected from the group
consisting of arginine, histidine, lysine, cysteine, glycine, glutamine, serine, threonine, 
tyrosine, aspartic acid and glutamic acid. 

In one embodiment, the dispersibility-enhancing polypeptide is selected from the
group consisting of human serum albumin, α-lactalbumin, trypsinogen, and polyalanine.

In one embodiment, the spray-dried iRNA agent composition includes particles 
having a mass median diameter (MMD) of less than 10 microns. In another embodiment,
the spray-dried iRNA agent composition includes particles having a mass median diameter of
less than 5 microns. In yet another embodiment the spray-dried iRNA agent composition
includes particles having a mass median aerodynamic diameter (MMAD) of less than 5
microns. 
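
As a rough illustration of how a mass median diameter threshold might be checked, the sketch below estimates the MMD from a list of measured particle diameters. It rests on assumptions that are not part of the specification: the particles are treated as spheres of equal density (so mass scales with the cube of the diameter), the function name is hypothetical, and the estimate is the simple discrete one without interpolation. It does not compute the MMAD, which additionally depends on particle density and shape.

```python
def mass_median_diameter(diameters_um):
    """Discrete estimate of the mass median diameter, in microns.

    Assumes spherical particles of equal density, so that each particle's
    mass is proportional to the cube of its diameter. Returns the smallest
    measured diameter at which the cumulative mass reaches half of the total.
    """
    if not diameters_um:
        raise ValueError("at least one particle diameter is required")
    ordered = sorted(diameters_um)
    masses = [d ** 3 for d in ordered]  # relative masses (arbitrary units)
    half_total = sum(masses) / 2.0
    cumulative = 0.0
    for diameter, mass in zip(ordered, masses):
        cumulative += mass
        if cumulative >= half_total:
            return diameter
```

A composition could then be compared against the recited limits with an expression such as `mass_median_diameter(sample) < 10`, where `sample` is a hypothetical list of measured diameters in microns.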

In certain other aspects, the invention provides kits that include a suitable container 
containing a pharmaceutical formulation of an iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed 
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, or precursor thereof). In certain embodiments the individual
components of the pharmaceutical formulation may be provided in one container.
Alternatively, it may be desirable to provide the components of the pharmaceutical
formulation separately in two or more containers, e.g., one container for an iRNA agent
preparation, and at least another for a carrier compound. The kit may be packaged in a
number of different configurations such as one or more containers in a single box. The 
different components can be combined, e.g., according to instructions provided with the kit. 
The components can be combined according to a method described herein, e.g., to prepare 
and administer a pharmaceutical composition. The kit can also include a delivery device. 

In another aspect, the invention features a device, e.g., an implantable device, wherein 
the device can dispense or administer a composition that includes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, or precursor thereof), e.g., an iRNA agent that
silences an endogenous transcript. In one embodiment, the device is coated with the
composition. In another embodiment the iRNA agent is disposed within the device. In
another embodiment, the device includes a mechanism to dispense a unit dose of the
composition. In other embodiments the device releases the composition continuously, e.g.,
by diffusion. Exemplary devices include stents, catheters, pumps, artificial organs or organ 
components (e.g., artificial heart, a heart valve, etc.), and sutures. 

As used herein, the term "crystalline" describes a solid having the structure or 
characteristics of a crystal, i.e., particles of three-dimensional structure in which the plane 
faces intersect at definite angles and in which there is a regular internal structure. The
compositions of the invention may have different crystalline forms. Crystalline forms can be
prepared by a variety of methods, including, for example, spray drying.

As used herein, "specifically hybridizable" and "complementary" are terms which are
used to indicate a sufficient degree of complementarity such that stable and specific binding 
occurs between a compound of the invention and a target RNA molecule. Specific binding 
requires a sufficient degree of complementarity to avoid non-specific binding of the 
oligomeric compound to non-target sequences under conditions in which specific binding is 
desired, i.e., under physiological conditions in the case of in vivo assays or therapeutic 
treatment, or in the case of in vitro assays, under conditions in which the assays are
performed. The non-target sequences typically differ by at least 5 nucleotides. 

In one embodiment, an iRNA agent is "sufficiently complementary" to a target RNA, 
e.g., a target mRNA, such that the iRNA agent silences production of protein encoded by the 
target mRNA. In another embodiment, the iRNA agent is "exactly complementary" to a 
target RNA, e.g., the target RNA and the iRNA agent anneal, preferably to form a hybrid 
made exclusively of Watson-Crick basepairs in the region of exact complementarity. A
"sufficiently complementary" iRNA agent can include an internal region (e.g., of at least 10
nucleotides) that is exactly complementary to a target RNA. Moreover, in some
embodiments, the iRNA agent specifically discriminates a single-nucleotide difference. In
this case, the iRNA agent only mediates RNAi if exact complementarity is found in the region
of (e.g., within 7 nucleotides of) the single-nucleotide difference.
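
The notions of exact complementarity and of an exactly complementary internal region can be illustrated with a short sketch. It makes several assumptions not found in the specification: the function names are hypothetical, sequences are unmodified RNA written 5' to 3' using only A, C, G, and U, only Watson-Crick pairs are counted (G-U wobble pairs count as mismatches), and the two strands are aligned end to end without gaps.

```python
# Watson-Crick partners for unmodified RNA bases.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5'->3'."""
    return "".join(WATSON_CRICK[base] for base in reversed(rna.upper()))


def is_exactly_complementary(antisense, target_site):
    """True if every position of the antisense strand forms a Watson-Crick
    pair with the antiparallel target site."""
    return antisense.upper() == reverse_complement(target_site)


def longest_watson_crick_run(antisense, target_site):
    """Length of the longest contiguous stretch of Watson-Crick pairs when
    two equal-length strands are aligned antiparallel."""
    if len(antisense) != len(target_site):
        raise ValueError("strands must be the same length")
    ideal = reverse_complement(target_site)
    best = run = 0
    for observed, expected in zip(antisense.upper(), ideal):
        run = run + 1 if observed == expected else 0
        best = max(best, run)
    return best
```

For instance, `reverse_complement("AUGC")` gives `"GCAU"`. An antisense strand for which `longest_watson_crick_run` returns at least 10 would contain an exactly complementary internal region of the kind mentioned above; whether such a strand actually silences the target is an experimental question that this sketch does not address.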

As used herein, the term "oligonucleotide" refers to a nucleic acid molecule (RNA or 
DNA) preferably of length less than 100, 200, 300, or 400 nucleotides. 

Unless otherwise defined, all technical and scientific terms used herein have the same 
meaning as commonly understood by one of ordinary skill in the art to which this invention 
pertains. The materials, methods, and examples are illustrative only and not intended to be 
limiting. Although methods and materials similar or equivalent to those described herein can
be used in the practice or testing of the present invention, useful methods and materials are
described below. Other features and advantages of the invention will be apparent from the
accompanying drawings and description, and from the claims. The contents of all references, 
pending patent applications and published patents, cited throughout this application are 
hereby expressly incorporated by reference. In case of conflict, the present specification, 
including definitions, will control. 

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a structural representation of base pairing in pseudocomplementary siRNA.

FIG. 2 is a schematic representation of dual targeting siRNAs designed to target the
HCV genome.

FIG. 3 is a schematic representation of pseudocomplementary, bifunctional siRNAs
designed to target the HCV genome.

FIG. 4 is a general synthetic scheme for incorporation of RRMS monomers into an
oligonucleotide.

FIG. 5 is a table of representative RRMS carriers. Panel 1 shows pyrroline-based
RRMSs; panel 2 shows 3-hydroxyproline-based RRMSs; panel 3 shows piperidine-based
RRMSs; panel 4 shows morpholine and piperazine-based RRMSs; and panel 5 shows
decalin-based RRMSs. R1 is succinate or phosphoramidate and R2 is H or a conjugate
ligand. 

FIG. 6A is a graph depicting levels of luciferase mRNA in livers of CMV-Luc mice
(Xenogen) following intravenous injection (iv) of buffer or siRNA into the tail vein. Each
bar represents data from one mouse. RNA levels were quantified by QuantiGene Assay
(Genospectra, Inc.; Fremont, CA). The Y axis represents chemiluminescence values in
counts per second (CPS).

FIG. 6B is a graph depicting levels of luciferase mRNA in livers of CMV-Luc mice
(Xenogen). The values are averaged from the data depicted in FIG. XxxA.

FIG. 7 is a graph depicting the pharmacokinetics of cholesterol-conjugated and
unconjugated siRNA. The diamonds represent the amount of unconjugated 33P-labeled
siRNA (ALN-3000) in mouse plasma over time; the squares represent the amount of
cholesterol-conjugated 33P-labeled siRNA (ALN-3001) in mouse plasma over time. "L1163"
is equivalent to ALN-3000; "L1163Chol" is equivalent to ALN-3001.

DETAILED DESCRIPTION 

Double-stranded RNA (dsRNA) directs the sequence-specific silencing of mRNA through
a process known as RNA interference (RNAi). The process occurs in a wide variety of
organisms, including mammals and other vertebrates.

It has been demonstrated that 21-23 nt fragments of dsRNA are sequence-specific
mediators of RNA silencing, e.g., by causing RNA degradation. While not wishing to be 
bound by theory, it may be that a molecular signal, which may be merely the specific length
of the fragments, present in these 21-23 nt fragments recruits cellular factors that mediate
RNAi. Described herein are methods for preparing and administering these 21-23 nt
fragments, and other iRNA agents, and their use for specifically inactivating gene function.
The use of iRNA agents (or recombinantly produced or chemically synthesized
oligonucleotides of the same or similar nature) enables the targeting of specific mRNAs for
silencing in mammalian cells. In addition, longer dsRNA agent fragments can also be used,
e.g., as described below.

Although, in mammalian cells, long dsRNAs can induce the interferon response 
which is frequently deleterious, sRNAs do not trigger the interferon response, at least not to
an extent that is deleterious to the cell and host. In particular, the length of the iRNA agent
strands in an sRNA agent can be less than 31, 30, 28, 25, or 23 nt, e.g., sufficiently short to
avoid inducing a deleterious interferon response. Thus, the administration of a composition
of sRNA agent (e.g., formulated as described herein) to a mammalian cell can be used to
silence expression of a target gene while circumventing the interferon response. Further, a
discrete species of iRNA agent can be used to selectively target one allele of a target
gene, e.g., in a subject heterozygous for the allele. 

Moreover, in one embodiment, a mammalian cell is treated with an iRNA agent that
disrupts a component of the interferon response, e.g., double stranded RNA (dsRNA)- 
activated protein kinase PKR. Such a cell can be treated with a second iRNA agent that 
includes a sequence complementary to a target RNA and that has a length that might 
otherwise trigger the interferon response. 

In a typical embodiment, the subject is a mammal such as a cow, horse, mouse, rat, 
dog, pig, goat, or a primate. The subject can be a dairy mammal (e.g., a cow, or goat) or
other farmed animal (e.g., a chicken, turkey, sheep, pig, fish, shrimp). In a much preferred
embodiment, the subject is a human, e.g., a normal individual or an individual that has, is 
diagnosed with, or is predicted to have a disease or disorder. 

Further, because iRNA agent mediated silencing persists for several days after 
administering the iRNA agent composition, in many instances, it is possible to administer the
composition with a frequency of less than once per day, or, for some instances, only once for
the entire therapeutic regimen. For example, treatment of some cancer cells may be 
mediated by a single bolus administration, whereas a chronic viral infection may require 
regular administration, e.g., once per week or once per month. 

A number of exemplary routes of delivery are described that can be used to 
administer an iRNA agent to a subject. In addition, the iRNA agent can be formulated 
according to an exemplary method described herein. 

iRNA AGENT STRUCTURE 

Described herein are isolated iRNA agents, e.g., RNA molecules, (double-stranded; 
single-stranded) that mediate RNAi. The iRNA agents preferably mediate RNAi with 
respect to an endogenous gene of a subject or to a gene of a pathogen. 

An "RNA agent" as used herein, is an unmodified RNA, modified RNA, or 
nucleoside surrogate, all of which are defined herein (see, e.g., the section below entitled 
RNA Agents). While numerous modified RNAs and nucleoside surrogates are described, 
preferred examples include those which have greater resistance to nuclease degradation than 
do unmodified RNAs. Preferred examples include those which have a 2' sugar modification, 
a modification in a single strand overhang, preferably a 3' single strand overhang, or, 
particularly if single stranded, a 5' modification which includes one or more phosphate 
groups or one or more analogs of a phosphate group. 

An "iRNA agent" as used herein, is an RNA agent which can, or which can be 
cleaved into an RNA agent which can, down regulate the expression of a target gene, 
preferably an endogenous or pathogen target RNA. While not wishing to be bound by 
theory, an iRNA agent may act by one or more of a number of mechanisms, including post-
transcriptional cleavage of a target mRNA sometimes referred to in the art as RNAi, or pre-
transcriptional or pre-translational mechanisms. An iRNA agent can include a single strand
or can include more than one strand, e.g., it can be a double stranded iRNA agent. If the
iRNA agent is a single strand it is particularly preferred that it include a 5' modification 
which includes one or more phosphate groups or one or more analogs of a phosphate group. 

The iRNA agent should include a region of sufficient homology to the target gene,
and be of sufficient length in terms of nucleotides, such that the iRNA agent, or a fragment
thereof, can mediate down regulation of the target gene. (For ease of exposition the term
nucleotide or ribonucleotide is sometimes used herein in reference to one or more monomeric 
subunits of an RNA agent. It will be understood herein that the usage of the term 
"ribonucleotide" or "nucleotide", herein can, in the case of a modified RNA or nucleotide 
surrogate, also refer to a modified nucleotide, or surrogate replacement moiety at one or more 
positions.) Thus, the iRNA agent is or includes a region which is at least partially, and in
some embodiments fully, complementary to the target RNA. It is not necessary that there be
perfect complementarity between the iRNA agent and the target, but the correspondence
must be sufficient to enable the iRNA agent, or a cleavage product thereof, to direct sequence 
specific silencing, e.g., by RNAi cleavage of the target RNA, e.g., mRNA. 

Complementarity, or degree of homology with the target strand, is most critical in the 
antisense strand. While perfect complementarity, particularly in the antisense strand, is often
desired, some embodiments can include, particularly in the antisense strand, one or more but
preferably 6, 5, 4, 3, 2, or fewer mismatches (with respect to the target RNA). The
mismatches, particularly in the antisense strand, are most tolerated in the terminal regions
and if present are preferably in a terminal region or regions, e.g., within 6, 5, 4, or 3
nucleotides of the 5' and/or 3' terminus. The sense strand need only be sufficiently
complementary with the antisense strand to maintain the overall double strand character of
the molecule. 
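
By way of illustration only, the mismatch criteria described above could be checked computationally. The following Python sketch is not part of the disclosure: it assumes plain A/C/G/U strings written 5' to 3', Watson-Crick pairing only, and an antisense strand equal in length to the target site. The function names and the default limits (at most 6 mismatches, each within 6 nucleotides of a terminus) are hypothetical choices taken from the ranges recited above.

```python
# Illustrative sketch only; names and default limits are assumptions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mismatch_positions(antisense, target_site):
    """Return 0-based antisense positions (5' to 3') that do not Watson-Crick pair.

    Both strings are written 5' to 3' and must be the same length; the
    antisense strand pairs with the target site in antiparallel orientation.
    """
    if len(antisense) != len(target_site):
        raise ValueError("strands must be the same length")
    paired_target = target_site[::-1]  # target base opposite each antisense base
    return [
        i
        for i, (a, t) in enumerate(zip(antisense, paired_target))
        if COMPLEMENT.get(a) != t
    ]


def mismatches_acceptable(antisense, target_site, max_mismatches=6, terminal_window=6):
    """True if the mismatches are few enough and all lie in a terminal region."""
    n = len(antisense)
    positions = mismatch_positions(antisense, target_site)
    in_terminal = all(
        p < terminal_window or p >= n - terminal_window for p in positions
    )
    return len(positions) <= max_mismatches and in_terminal
```

Under these assumptions, a single mismatch at the first antisense position of a 21-nucleotide strand would be accepted, while a single mismatch at the center of the strand would be rejected.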

As discussed elsewhere herein, an iRNA agent will often be modified or include 
nucleoside surrogates in addition to the RRMS. Single stranded regions of an iRNA agent 
will often be modified or include nucleoside surrogates, e.g., the unpaired region or regions 
of a hairpin structure, e.g., a region which links two complementary regions, can have 
modifications or nucleoside surrogates. Modification to stabilize one or more 3'- or 5'- 
terminus of an iRNA agent, e.g., against exonucleases, or to favor the antisense sRNA agent 
to enter into RISC are also favored. Modifications can include C3 (or C6, C7, C12) amino
linkers, thiol linkers, carboxyl linkers, non-nucleotidic spacers (C3, C6, C9, C12, abasic,
triethylene glycol, hexaethylene glycol), special biotin or fluorescein reagents that come as
phosphoramidites and that have another DMT-protected hydroxyl group, allowing multiple
couplings during RNA synthesis.

iRNA agents include: molecules that are long enough to trigger the interferon
response (which can be cleaved by Dicer (Bernstein et al. 2001, Nature, 409:363-366) and
enter a RISC (RNAi-induced silencing complex)); and, molecules which are sufficiently 
short that they do not trigger the interferon response (which molecules can also be cleaved by 
Dicer and/or enter a RISC), e.g., molecules which are of a size which allows entry into a 
RISC, e.g., molecules which resemble Dicer-cleavage products. Molecules that are short 
enough that they do not trigger an interferon response are termed sRNA agents or shorter 
iRNA agents herein. "sRNA agent or shorter iRNA agent" as used herein, refers to an iRNA 
agent, e.g., a double stranded RNA agent or single strand agent, that is sufficiently short that 
it does not induce a deleterious interferon response in a human cell, e.g., it has a duplexed
region of less than 60 but preferably less than 50, 40, or 30 nucleotide pairs. The sRNA
agent, or a cleavage product thereof, can down regulate a target gene, e.g., by inducing RNAi 
with respect to a target RNA, preferably an endogenous or pathogen target RNA. 

Each strand of an sRNA agent can be equal to or less than 30, 25, 24, 23, 22, 21, or 20
nucleotides in length. The strand is preferably at least 19 nucleotides in length. For example, 
each strand can be between 21 and 25 nucleotides in length. Preferred sRNA agents have a 
duplex region of 17, 18, 19, 29, 21, 22, 23, 24, or 25 nucleotide pairs, and one or more 
overhangs, preferably one or two 3' overhangs, of 2-3 nucleotides.
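
As a minimal sketch under stated assumptions, the preferred sRNA dimensions recited above can be expressed as a simple range check. The class and function names below are hypothetical, and the default bounds (strands of 19 to 30 nucleotides, a duplex of 17 to 25 nucleotide pairs, one or two 3' overhangs of 2-3 nucleotides) are merely one reading of the recited values, not a definition.

```python
# Illustrative sketch only; names and default bounds are assumptions.
from dataclasses import dataclass


@dataclass
class SrnaGeometry:
    sense_len: int        # nucleotides in the sense strand
    antisense_len: int    # nucleotides in the antisense strand
    duplex_pairs: int     # nucleotide pairs in the duplexed region
    overhangs_3p: tuple   # lengths of the 3' overhangs (one or two entries)


def within_preferred_ranges(g, min_strand=19, max_strand=30,
                            min_duplex=17, max_duplex=25,
                            overhang_range=(2, 3)):
    """Check strand length, duplex length and 3' overhang length against bounds."""
    strands_ok = all(
        min_strand <= n <= max_strand for n in (g.sense_len, g.antisense_len)
    )
    duplex_ok = min_duplex <= g.duplex_pairs <= max_duplex
    lo, hi = overhang_range
    overhangs_ok = 1 <= len(g.overhangs_3p) <= 2 and all(
        lo <= n <= hi for n in g.overhangs_3p
    )
    return strands_ok and duplex_ok and overhangs_ok
```

For example, two 21-nucleotide strands forming a 19-pair duplex with two 2-nucleotide 3' overhangs would satisfy these assumed bounds, whereas 35-nucleotide strands would not.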

In addition to homology to target RNA and the ability to down regulate a target gene, 
an iRNA agent will preferably have one or more of the following properties: 

(1) it will be of the Formula 1, 2, 3, or 4 set out in the RNA Agent section below; 

(2) if single stranded it will have a 5' modification which includes one or more
phosphate groups or one or more analogs of a phosphate group; 

(3) it will, despite modifications, even to a very large number, or all of the 
nucleosides, have an antisense strand that can present bases (or modified bases) in the proper 
three dimensional framework so as to be able to form correct base pairing and form a duplex
structure with a homologous target RNA which is sufficient to allow down regulation of the 
target, e.g., by cleavage of the target RNA; 

(4) it will, despite modifications, even to a very large number, or all of the
nucleosides, still have "RNA-like" properties, i.e., it will possess the overall structural,
chemical and physical properties of an RNA molecule, even though not exclusively, or even
partly, of ribonucleotide-based content. For example, an iRNA agent can contain, e.g., a
sense and/or an antisense strand in which all of the nucleotide sugars contain e.g., 2' fluoro in
place of 2' hydroxyl. This deoxyribonucleotide-containing agent can still be expected to
exhibit RNA-like properties. While not wishing to be bound by theory, the electronegative
fluorine prefers an axial orientation when attached to the C2' position of ribose. This spatial
preference of fluorine can, in turn, force the sugars to adopt a C3'-endo pucker. This is the
same puckering mode as observed in RNA molecules and gives rise to the RNA-
characteristic A-family-type helix. Further, since fluorine is a good hydrogen bond acceptor,
it can participate in the same hydrogen bonding interactions with water molecules that are
known to stabilize RNA structures. (Generally, it is preferred that a modified moiety at the
2' sugar position will be able to enter into H-bonding which is more characteristic of the OH
moiety of a ribonucleotide than the H moiety of a deoxyribonucleotide.) A preferred iRNA
agent will: exhibit a C3'-endo pucker in all, or at least 50, 75, 80, 85, 90, or 95 % of its
sugars; exhibit a C3'-endo pucker in a sufficient amount of its sugars that it can give rise to
the RNA-characteristic A-family-type helix; will have no more than 20, 10, 5, 4, 3, 2, or 1
sugar which is not a C3'-endo pucker structure. These limitations are particularly preferably
in the antisense strand;

(5) regardless of the nature of the modification, and even though the RNA agent 
can contain deoxynucleotides or modified deoxynucleotides, particularly in overhang or 
other single strand regions, it is preferred that DNA molecules, or any molecule in which 
more than 50, 60, or 70 % of the nucleotides in the molecule, or more than 50, 60, or 70 % of 
the nucleotides in a duplexed region are deoxyribonucleotides, or modified 
deoxyribonucleotides which are deoxy at the 2' position, are excluded from the definition of 
RNA agent. 
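
Property (5) above is essentially a threshold on the fraction of nucleotides that are deoxy at the 2' position. The sketch below illustrates one way that threshold might be evaluated; it is not part of the disclosure. It assumes a strand is represented as a list of 2' substituent labels ("OH", "H", "F", "OMe", and so on), a representation and set of function names invented here for illustration, and it uses 50 % as the default cut-off (60 or 70 % could equally be passed).

```python
# Illustrative sketch only; the label scheme and names are assumptions.
def deoxy_fraction(two_prime_groups):
    """Fraction of positions whose 2' substituent is hydrogen (2'-deoxy)."""
    if not two_prime_groups:
        raise ValueError("at least one nucleotide is required")
    deoxy = sum(1 for group in two_prime_groups if group == "H")
    return deoxy / len(two_prime_groups)


def excluded_as_dna(two_prime_groups, threshold=0.5):
    """True if MORE than `threshold` of the nucleotides are deoxy at the 2' position."""
    return deoxy_fraction(two_prime_groups) > threshold
```

Under this reading, a 21-nucleotide strand with two deoxy overhang nucleotides, or a fully 2'-fluoro strand, would remain within the definition, while a strand that is mostly 2'-deoxy would be excluded.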

A "single strand iRNA agent" as used herein, is an iRNA agent which is made up of a 
single molecule. It may include a duplexed region, formed by intra-strand pairing, e.g., it 
may be, or include, a hairpin or pan-handle structure. Single strand iRNA agents are 
preferably antisense with regard to the target molecule. In preferred embodiments single
strand iRNA agents are 5' phosphorylated or include a phosphoryl analog at the 5'
terminus. 5'-phosphate modifications include those which are compatible with RISC
mediated gene silencing. Suitable modifications include: 5'-monophosphate ((HO)2(O)P-O-
5'); 5'-diphosphate ((HO)2(O)P-O-P(HO)(O)-O-5'); 5'-triphosphate ((HO)2(O)P-O-
(HO)(O)P-O-P(HO)(O)-O-5'); 5'-guanosine cap (7-methylated or non-methylated) (7m-G-O-
5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-adenosine cap (Appp), and any modified or
unmodified nucleotide cap structure (N-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-
monothiophosphate (phosphorothioate; (HO)2(S)P-O-5'); 5'-monodithiophosphate
(phosphorodithioate; (HO)(HS)(S)P-O-5'), 5'-phosphorothiolate ((HO)2(O)P-S-5'); any
additional combination of oxygen/sulfur replaced monophosphate, diphosphate and
triphosphates (e.g. 5'-alpha-thiotriphosphate, 5'-gamma-thiotriphosphate, etc.), 5'-
phosphoramidates ((HO)2(O)P-NH-5', (HO)(NH2)(O)P-O-5'), 5'-alkylphosphonates
(R=alkyl=methyl, ethyl, isopropyl, propyl, etc., e.g. RP(OH)(O)-O-5'-, (OH)2(O)P-5'-CH2-),
5'-alkyletherphosphonates (R=alkylether=methoxymethyl (MeOCH2-), ethoxymethyl, etc.,
e.g. RP(OH)(O)-O-5'-). (These modifications can also be used with the antisense strand of a
double stranded iRNA.)

A single strand iRNA agent should be sufficiently long that it can enter the RISC and
participate in RISC mediated cleavage of a target mRNA. A single strand iRNA agent is at
least 14, and more preferably at least 15, 20, 25, 29, 35, 40, or 50 nucleotides in length. It is
preferably less than 200, 100, or 60 nucleotides in length. 

Hairpin iRNA agents will have a duplex region equal to or at least 17, 18, 19, 29, 21,
22, 23, 24, or 25 nucleotide pairs. The duplex region will preferably be equal to or less than
200, 100, or 50, in length. Preferred ranges for the duplex region are 15-30, 17 to 23, 19 to
23, and 19 to 21 nucleotide pairs in length. The hairpin will preferably have a single strand
overhang or terminal unpaired region, preferably the 3', and preferably of the antisense side
of the hairpin. Preferred overhangs are 2-3 nucleotides in length.

A "double stranded (ds) iRNA agent" as used herein, is an iRNA agent which 
includes more than one, and preferably two, strands in which interchain hybridization can 
form a region of duplex structure. 

The antisense strand of a double stranded iRNA agent should be equal to or at least
14, 15, 16, 17, 18, 19, 25, 29, 40, or 60 nucleotides in length. It should be equal to or less
than 200, 100, or 50 nucleotides in length. Preferred ranges are 17 to 25, 19 to 23, and 19
to 21 nucleotides in length.

The sense strand of a double stranded iRNA agent should be equal to or at least 14,
15, 16, 17, 18, 19, 25, 29, 40, or 60 nucleotides in length. It should be equal to or less than
200, 100, or 50 nucleotides in length. Preferred ranges are 17 to 25, 19 to 23, and 19 to 21
nucleotides in length.

The double strand portion of a double stranded iRNA agent should be equal to or at
least 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 29, 40, or 60 nucleotide pairs in length. It
should be equal to or less than 200, 100, or 50 nucleotide pairs in length. Preferred ranges
are 15-30, 17 to 23, 19 to 23, and 19 to 21 nucleotide pairs in length.

In many embodiments, the ds iRNA agent is sufficiently large that it can be cleaved
by an endogenous molecule, e.g., by Dicer, to produce smaller ds iRNA agents, e.g., sRNA
agents.

It may be desirable to modify one or both of the antisense and sense strands of a 
double strand iRNA agent. In some cases they will have the same modification or the same 
class of modification but in other cases the sense and antisense strand will have different 
modifications, e.g., in some cases it is desirable to modify only the sense strand. It may be 
desirable to modify only the sense strand, e.g., to inactivate it, e.g., the sense strand can be
modified in order to inactivate the sense strand and prevent formation of an active
sRNA/protein or RISC. This can be accomplished by a modification which prevents 5'-
phosphorylation of the sense strand, e.g., by modification with a 5'-O-methyl ribonucleotide
(see Nykanen et al., (2001) ATP requirements and small interfering RNA structure in the
RNA interference pathway. Cell 107, 309-321.) Other modifications which prevent
phosphorylation can also be used, e.g., simply substituting the 5'-OH by H rather than O-Me.
Alternatively, a large bulky group may be added to the 5'-phosphate turning it into a
phosphodiester linkage, though this may be less desirable as phosphodiesterases can cleave
such a linkage and release a functional sRNA 5'-end. Antisense strand modifications include
5' phosphorylation as well as any of the other 5' modifications discussed herein, particularly
the 5' modifications discussed above in the section on single stranded iRNA molecules. 

It is preferred that the sense and antisense strands be chosen such that the ds iRNA 
agent includes a single strand or unpaired region at one or both ends of the molecule. Thus, a
ds iRNA agent contains sense and antisense strands, preferably paired to contain an
overhang, e.g., one or two 5' or 3' overhangs but preferably a 3' overhang of 2-3
nucleotides. Most embodiments will have a 3' overhang. Preferred sRNA agents will have
single-stranded overhangs, preferably 3' overhangs, of 1 or preferably 2 or 3 nucleotides in
length at each end. The overhangs can be the result of one strand being longer than the other,
or the result of two strands of the same length being staggered. 5' ends are preferably
phosphorylated. 
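
The two sources of overhangs described above (unequal strand lengths, or equal-length strands that are staggered) can be illustrated with a small geometric calculation. The Python sketch below is offered only as an aid to understanding and rests on assumptions not found in the disclosure: the sense strand is laid out 5' to 3' on coordinates 0 to sense_len, the antisense strand pairs with it antiparallel, and a hypothetical `offset` parameter gives the coordinate of the antisense 3' end (negative when it protrudes past the sense 5' end).

```python
# Illustrative sketch only; the coordinate model and names are assumptions.
def duplex_geometry(sense_len, antisense_len, offset):
    """Return duplex length and overhang lengths for two annealed strands.

    The sense strand occupies coordinates [0, sense_len). The antisense strand,
    paired antiparallel, occupies [offset, offset + antisense_len), with its
    3' end at `offset` and its 5' end at `offset + antisense_len`.
    """
    left = offset
    right = offset + antisense_len
    duplex_pairs = max(0, min(sense_len, right) - max(0, left))
    return {
        "duplex_pairs": duplex_pairs,
        "antisense_3p_overhang": max(0, -left),
        "sense_5p_overhang": max(0, left),
        "sense_3p_overhang": max(0, sense_len - right),
        "antisense_5p_overhang": max(0, right - sense_len),
    }
```

In this model, two 21-nucleotide strands staggered by two positions (`offset=-2`) give a 19-pair duplex with a 2-nucleotide 3' overhang on each strand, while a 19-nucleotide sense strand paired with a 21-nucleotide antisense strand at the same offset gives a single 3' overhang on the antisense strand and one blunt end.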

Preferred lengths for the duplexed region are between 15 and 30, most preferably 18,
19, 20, 21, 22, and 23 nucleotides in length, e.g., in the sRNA agent range discussed above.
sRNA agents can resemble in length and structure the natural Dicer processed products from
long dsRNAs. Embodiments in which the two strands of the sRNA agent are linked, e.g.,
covalently linked are also included. Hairpin, or other single strand structures which provide
the required double stranded region, and preferably a 3' overhang are also within the 
invention. 

The isolated iRNA agents described herein, including ds iRNA agents and sRNA 
agents can mediate silencing of a target RNA, e.g., mRNA, e.g., a transcript of a gene that 
encodes a protein. For convenience, such mRNA is also referred to herein as mRNA to be 
silenced. Such a gene is also referred to as a target gene. In general, the RNA to be silenced 
is an endogenous gene or a pathogen gene. In addition, RNAs other than mRNA, e.g., 
tRNAs, and viral RNAs, can also be targeted. 

As used herein, the phrase "mediates RNAi" refers to the ability to silence, in a 
sequence specific manner, a target RNA. While not wishing to be bound by theory, it is 
believed that silencing uses the RNAi machinery or process and a guide RNA, e.g., an sRNA 
agent of 21 to 23 nucleotides. 

As used herein, "specifically hybridizable" and "complementary" are terms which are
used to indicate a sufficient degree of complementarity such that stable and specific binding 
occurs between a compound of the invention and a target RNA molecule. Specific binding 
requires a sufficient degree of complementarity to avoid non-specific binding of the 
oligomeric compound to non-target sequences under conditions in which specific binding is 
desired, i.e., under physiological conditions in the case of in vivo assays or therapeutic
treatment, or in the case of in vitro assays, under conditions in which the assays are
performed. The non-target sequences typically differ by at least 5 nucleotides.

In one embodiment, an iRNA agent is "sufficiently complementary" to a target RNA,
e.g., a target mRNA, such that the iRNA agent silences production of protein encoded by the
target mRNA. In another embodiment, the iRNA agent is "exactly complementary"
(excluding the RRMS containing subunit(s)) to a target RNA, e.g., the target RNA and the
iRNA agent anneal, preferably to form a hybrid made exclusively of Watson-Crick basepairs
in the region of exact complementarity. A "sufficiently complementary" target RNA can
include an internal region (e.g., of at least 10 nucleotides) that is exactly complementary to a
target RNA. Moreover, in some embodiments, the iRNA agent specifically discriminates a
single-nucleotide difference. In this case, the iRNA agent only mediates RNAi if exact
complementarity is found in the region (e.g., within 7 nucleotides of) the single-nucleotide
difference. 

As used herein, the term "oligonucleotide" refers to a nucleic acid molecule (RNA or 
DNA) preferably of length less than 100, 200, 300, or 400 nucleotides. 

RNA agents discussed herein include otherwise unmodified RNA as well as RNA 
which have been modified, e.g., to improve efficacy, and polymers of nucleoside surrogates.
Unmodified RNA refers to a molecule in which the components of the nucleic acid, namely
sugars, bases, and phosphate moieties, are the same or essentially the same as that which
occur in nature, preferably as occur naturally in the human body. The art has referred to rare
or unusual, but naturally occurring, RNAs as modified RNAs, see, e.g., Limbach et al.,
(1994) Summary: the modified nucleosides of RNA, Nucleic Acids Res. 22: 2183-2196.
Such rare or unusual RNAs, often termed modified RNAs (apparently because these are
typically the result of a post-transcriptional modification) are within the term unmodified
RNA, as used herein. Modified RNA as used herein refers to a molecule in which one or
more of the components of the nucleic acid, namely sugars, bases, and phosphate moieties,
are different from that which occur in nature, preferably different from that which occurs in
the human body. While they are referred to as modified "RNAs," they will of course,
because of the modification, include molecules which are not RNAs. Nucleoside surrogates
are molecules in which the ribophosphate backbone is replaced with a non-ribophosphate
construct that allows the bases to be presented in the correct spatial relationship such that
hybridization is substantially similar to what is seen with a ribophosphate backbone, e.g.,
non-charged mimics of the ribophosphate backbone. Examples of all of the above are
discussed herein.

Much of the discussion below refers to single strand molecules. In many 
embodiments of the invention a double stranded iRNA agent, e.g., a partially double stranded
iRNA agent, is required or preferred. Thus, it is understood that double stranded
structures (e.g. where two separate molecules are contacted to form the double stranded 
region or where the double stranded region is formed by intramolecular pairing (e.g., a 
hairpin structure)) made of the single stranded structures described below are within the
invention. Preferred lengths are described elsewhere herein. 

As nucleic acids are polymers of subunits or monomers, many of the modifications 
described below occur at a position which is repeated within a nucleic acid, e.g., a 
modification of a base, or a phosphate moiety, or a non-linking O of a phosphate moiety.
In some cases the modification will occur at all of the subject positions in the nucleic acid but
in many, and in fact in most cases it will not. By way of example, a modification may only
occur at a 3' or 5' terminal position, may only occur in a terminal region, e.g. at a position
on a terminal nucleotide or in the last 2, 3, 4, 5, or 10 nucleotides of a strand. A modification
may occur in a double strand region, a single strand region, or in both. A modification may
occur only in the double strand region of an RNA or may only occur in a single strand region
of an RNA. E.g., a phosphorothioate modification at a non-linking O position may only
occur at one or both termini, may only occur in a terminal region, e.g., at a position on a
terminal nucleotide or in the last 2, 3, 4, 5, or 10 nucleotides of a strand, or may occur in 
double strand and single strand regions, particularly at termini. The 5' end or ends can be 
phosphorylated. 

In some embodiments it is particularly preferred, e.g., to enhance stability, to include 
particular bases in overhangs, or to include modified nucleotides or nucleotide surrogates, in 
single strand overhangs, e.g., in a 5' or 3' overhang, or in both. E.g., it can be desirable to 
include purine nucleotides in overhangs. In some embodiments all or some of the bases in a 
3' or 5' overhang will be modified, e.g., with a modification described herein. Modifications 
can include, e.g., the use of modifications at the 2' OH group of the ribose sugar, e.g., the use 
of deoxyribonucleotides, e.g., deoxythymidine, instead of ribonucleotides, and modifications
in the phosphate group, e.g., phosphothioate modifications. Overhangs need not be
homologous with the target sequence. 

Modifications and nucleotide surrogates are discussed below. 




[Chemical structure not reproduced; surviving labels: (2' OH), (2' OH)]

FORMULA 1

The scaffold presented above in Formula 1 represents a portion of a ribonucleic acid.
The basic components are the ribose sugar, the base, the terminal phosphates, and phosphate
internucleotide linkers. Where the bases are naturally occurring bases, e.g., adenine, uracil,
guanine or cytosine, the sugars are the unmodified 2' hydroxyl ribose sugar (as depicted) and
W, X, Y, and Z are all O, Formula 1 represents a naturally occurring unmodified
oligoribonucleotide.

Unmodified oligoribonucleotides may be less than optimal in some applications, e.g.,
unmodified oligoribonucleotides can be prone to degradation by e.g., cellular nucleases.
Nucleases can hydrolyze nucleic acid phosphodiester bonds. However, chemical
modifications to one or more of the above RNA components can confer improved properties,
and, e.g., can render oligoribonucleotides more stable to nucleases. Unmodified
oligoribonucleotides may also be less than optimal in terms of offering tethering points for 
attaching ligands or other moieties to an iRNA agent. 

Modified nucleic acids and nucleotide surrogates can include one or more of: 

(i) alteration, e.g., replacement, of one or both of the non-linking (X and Y)
phosphate oxygens and/or of one or more of the linking (W and Z) phosphate oxygens 
(When the phosphate is in the terminal position, one of the positions W or Z will not link the 
phosphate to an additional element in a naturally occurring ribonucleic acid. However, for 
simplicity of terminology, except where otherwise noted, the W position at the 5' end of a 
nucleic acid and the terminal Z position at the 3' end of a nucleic acid, are within the term
"linking phosphate oxygens" as used herein.);

(ii) alteration, e.g., replacement, of a constituent of the ribose sugar, e.g., of the 2'
hydroxyl on the ribose sugar, or wholesale replacement of the ribose sugar with a structure 
other than ribose, e.g., as described herein; 

(iii) wholesale replacement of the phosphate moiety (bracket I) with "dephospho"
linkers;

(iv) modification or replacement of a naturally occurring base; 

(v) replacement or modification of the ribose-phosphate backbone (bracket II); 

(vi) modification of the 3' end or 5' end of the RNA, e.g., removal, modification or 
replacement of a terminal phosphate group or conjugation of a moiety, e.g. a fluorescently 
labeled moiety, to either the 3' or 5' end of RNA. 

The terms replacement, modification, alteration, and the like, as used in this context, 
do not imply any process limitation, e.g., modification does not mean that one must start with 
a reference or naturally occurring ribonucleic acid and modify it to produce a modified 
ribonucleic acid but rather modified simply indicates a difference from a naturally occurring
molecule. 

It is understood that the actual electronic structure of some chemical entities cannot
be adequately represented by only one canonical form (i.e. Lewis structure). While not
wishing to be bound by theory, the actual structure can instead be some hybrid or weighted
average of two or more canonical forms, known collectively as resonance forms or
structures. Resonance structures are not discrete chemical entities and exist only on paper.
They differ from one another only in the placement or "localization" of the bonding and 
nonbonding electrons for a particular chemical entity. It can be possible for one resonance 
structure to contribute to a greater extent to the hybrid than the others. Thus, the written and 
graphical descriptions of the embodiments of the present invention are made in terms of what 
the art recognizes as the predominant resonance form for a particular species. For example, 
any phosphoroamidate (replacement of a nonlinking oxygen with nitrogen) would be 
represented by X = O and Y = N in the above figure. 

Specific modifications are discussed in more detail below. 

The Phosphate Group 

The phosphate group is a negatively charged species. The charge is distributed 
equally over the two non-linking oxygen atoms (i.e., X and Y in Formula 1 above). However,
the phosphate group can be modified by replacing one of the oxygens with a different
substituent. One result of this modification to RNA phosphate backbones can be increased
resistance of the oligoribonucleotide to nucleolytic breakdown. Thus while not wishing to be
bound by theory, it can be desirable in some embodiments to introduce alterations which
result in either an uncharged linker or a charged linker with unsymmetrical charge 
distribution. 

Examples of modified phosphate groups include phosphorothioate, 
phosphoroselenates, borano phosphates, borano phosphate esters, hydrogen phosphonates, 
phosphoroamidates, alkyl or aryl phosphonates and phosphotriesters. Phosphorodithioates
have both non-linking oxygens replaced by sulfur. Unlike the situation where only one of X
or Y is altered, the phosphorus center in the phosphorodithioates is achiral which precludes
the formation of oligoribonucleotide diastereomers. Diastereomer formation can result in a
preparation in which the individual diastereomers exhibit varying resistance to nucleases. 
Further, the hybridization affinity of RNA containing chiral phosphate groups can be lower 
relative to the corresponding unmodified RNA species. Thus, while not wishing to be bound 
by theory, modifications to both X and Y which eliminate the chiral center, e.g. 
phosphorodithioate formation, may be desirable in that they cannot produce diastereomer
mixtures. Thus, X can be any one of S, Se, B, C, H, N, or OR (R is alkyl or aryl). Thus Y
can be any one of Se, B, C, H, N, or OR (R is alkyl or aryl). Replacement of X and/or Y
with sulfur is preferred.
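
The stereochemical point made above, that altering only one of X or Y creates a chiral phosphorus center while altering both identically does not, can be illustrated with a short calculation. The sketch below is a simplification offered under stated assumptions and is not part of the disclosure: each internucleotide linkage is represented only by its pair of non-linking substituents (X, Y), the linking positions W and Z are assumed to carry different groups, and synthesis is assumed to proceed without stereocontrol, so that n stereogenic phosphorus centers give up to 2^n diastereomers. The function names are hypothetical.

```python
# Illustrative sketch only; the (X, Y) representation and names are assumptions.
def phosphorus_is_stereogenic(x, y):
    """A linkage phosphorus is treated as stereogenic when X and Y differ."""
    return x != y


def max_diastereomers(linkages):
    """Upper bound on diastereomers for a strand, given (X, Y) for each linkage."""
    stereocenters = sum(1 for x, y in linkages if phosphorus_is_stereogenic(x, y))
    return 2 ** stereocenters
```

For instance, a strand with three phosphorothioate linkages (X = S, Y = O) among otherwise unmodified linkages could exist as up to eight diastereomers under these assumptions, whereas a fully phosphorodithioate strand (X = Y = S) has no phosphorus stereocenters and so a single form.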

The phosphate linker can also be modified by replacement of a linking oxygen (i.e.,
W or Z in Formula 1) with nitrogen (bridged phosphoroamidates), sulfur (bridged
phosphorothioates) and carbon (bridged methylenephosphonates). The replacement can
occur at a terminal oxygen (position W (3') or position Z (5')). Replacement of W with
carbon or Z with nitrogen is preferred.

Candidate agents can be evaluated for suitability as described below. 

The Sugar Group 

A modified RNA can include modification of all or some of the sugar groups of the 
ribonucleic acid. E.g., the 2' hydroxyl group (OH) can be modified or replaced with a
number of different "oxy" or "deoxy" substituents. While not being bound by theory, 
enhanced stability is expected since the hydroxyl can no longer be deprotonated to form a 2' 
alkoxide ion. The 2' alkoxide can catalyze degradation by intramolecular nucleophilic attack 
on the linker phosphorus atom. Again, while not wishing to be bound by theory, it can be
desirable in some embodiments to introduce alterations in which alkoxide formation at the 2'
position is not possible.

Examples of "oxy"-2' hydroxyl group modifications include alkoxy or aryloxy (OR, 
e.g., R = H, alkyl, cycloalkyl, aryl, aralkyl, heteroaryl or sugar); polyethyleneglycols (PEG),
O(CH2CH2O)nCH2CH2OR; "locked" nucleic acids (LNA) in which the 2' hydroxyl is
connected, e.g., by a methylene bridge, to the 4' carbon of the same ribose sugar; O-AMINE
(AMINE = NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl
amino, or diheteroaryl amino, ethylene diamine, polyamino) and aminoalkoxy,
O(CH2)nAMINE, (e.g., AMINE = NH2; alkylamino, dialkylamino, heterocyclyl, arylamino,
diaryl amino, heteroaryl amino, or diheteroaryl amino, ethylene diamine, polyamino). It is
noteworthy that oligonucleotides containing only the methoxyethyl group (MOE),
(OCH2CH2OCH3, a PEG derivative), exhibit nuclease stabilities comparable to those
modified with the robust phosphorothioate modification.

"Deoxy" modifications include hydrogen (i.e. deoxyribose sugars, which are of
particular relevance to the overhang portions of partially ds RNA); halo (e.g., fluoro); amino
(e.g. NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl
amino, diheteroaryl amino, or amino acid); NH(CH2CH2NH)nCH2CH2-AMINE (AMINE =
NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or
diheteroaryl amino), -NHC(O)R (R = alkyl, cycloalkyl, aryl, aralkyl, heteroaryl or sugar),
cyano; mercapto; alkyl-thio-alkyl; thioalkoxy; and alkyl, cycloalkyl, aryl, alkenyl and
alkynyl, which may be optionally substituted with e.g., an amino functionality. Preferred
substituents are 2'-methoxyethyl, 2'-OCH3, 2'-O-allyl, 2'-C-allyl, and 2'-fluoro.

The sugar group can also contain one or more carbons that possess the opposite
stereochemical configuration to that of the corresponding carbon in ribose. Thus, a
modified RNA can include nucleotides containing, e.g., arabinose as the sugar.

Modified RNA's can also include "abasic" sugars, which lack a nucleobase at C-1'.
These abasic sugars can also further contain modifications at one or more of the
constituent sugar atoms.

To maximize nuclease resistance, the 2' modifications can be used in combination
with one or more phosphate linker modifications (e.g., phosphorothioate). The so-called
"chimeric" oligonucleotides are those that contain two or more different modifications.

The modification can also entail the wholesale replacement of a ribose structure with
another entity at one or more sites in the iRNA agent. These modifications are described in
the section entitled Ribose Replacements for RRMSs.

Candidate modifications can be evaluated as described below. 

Replacement of the Phosphate Group 

The phosphate group can be replaced by non-phosphorus containing connectors (cf.
Bracket I in Formula 1 above). While not wishing to be bound by theory, it is believed that
since the charged phosphodiester group is the reaction center in nucleolytic degradation, its
replacement with neutral structural mimics should impart enhanced nuclease stability.
Again, while not wishing to be bound by theory, it can be desirable, in some embodiments, to
introduce alterations in which the charged phosphate group is replaced by a neutral moiety.

Examples of moieties which can replace the phosphate group include siloxane,
carbonate, carboxymethyl, carbamate, amide, thioether, ethylene oxide linker, sulfonate,
sulfonamide, thioformacetal, formacetal, oxime, methyleneimino, methylenemethylimino,
methylenehydrazo, methylenedimethylhydrazo and methyleneoxymethylimino. Preferred
replacements include the methylenecarbonylamino and methylenemethylimino groups.

Candidate modifications can be evaluated as described below.

Replacement of Ribophosphate Backbone 

Oligonucleotide-mimicking scaffolds can also be constructed wherein the phosphate
linker and ribose sugar are replaced by nuclease resistant nucleoside or nucleotide surrogates
(see Bracket II of Formula 1 above). While not wishing to be bound by theory, it is believed
that the absence of a repetitively charged backbone diminishes binding to proteins that
recognize polyanions (e.g., nucleases). Again, while not wishing to be bound by theory, it
can be desirable, in some embodiments, to introduce alterations in which the bases are tethered
by a neutral surrogate backbone.

Examples include the morpholino, cyclobutyl, pyrrolidine and peptide nucleic acid
(PNA) nucleoside surrogates. A preferred surrogate is a PNA surrogate.

Candidate modifications can be evaluated as described below. 

Terminal Modifications 

The 3' and 5' ends of an oligonucleotide can be modified. Such modifications can be
at the 3' end, 5' end or both ends of the molecule. They can include modification or
replacement of an entire terminal phosphate or of one or more of the atoms of the phosphate
group. E.g., the 3' and 5' ends of an oligonucleotide can be conjugated to other functional
molecular entities such as labeling moieties, e.g., fluorophores (e.g., pyrene, TAMRA,
fluorescein, Cy3 or Cy5 dyes) or protecting groups (based e.g., on sulfur, silicon, boron or
ester). The functional molecular entities can be attached to the sugar through a phosphate
group and/or a spacer. The terminal atom of the spacer can connect to or replace the linking
atom of the phosphate group or the C-3' or C-5' O, N, S or C group of the sugar.
Alternatively, the spacer can connect to or replace the terminal atom of a nucleotide
surrogate (e.g., PNAs). These spacers or linkers can include e.g., -(CH2)n-, -(CH2)nN-,
-(CH2)nO-, -(CH2)nS-, O(CH2CH2O)nCH2CH2OH (e.g., n = 3 or 6), abasic sugars, amide,
carboxy, amine, oxyamine, oxyimine, thioether, disulfide, thiourea, sulfonamide, or
morpholino, or biotin and fluorescein reagents. When a spacer/phosphate-functional
molecular entity-spacer/phosphate array is interposed between two strands of iRNA agents,
this array can substitute for a hairpin RNA loop in a hairpin-type RNA agent. The 3' end can
be an -OH group. While not wishing to be bound by theory, it is believed that conjugation of
certain moieties can improve transport, hybridization, and specificity properties. Again,
while not wishing to be bound by theory, it may be desirable to introduce terminal alterations
that improve nuclease resistance. Other examples of terminal modifications include dyes,
intercalating agents (e.g., acridines), cross-linkers (e.g., psoralene, mitomycin C), porphyrins
(TPPC4, texaphyrin, Sapphyrin), polycyclic aromatic hydrocarbons (e.g., phenazine,
dihydrophenazine), artificial endonucleases (e.g., EDTA), lipophilic carriers (e.g., cholesterol,
cholic acid, adamantane acetic acid, 1-pyrene butyric acid, dihydrotestosterone,
1,3-Bis-O(hexadecyl)glycerol, geranyloxyhexyl group, hexadecylglycerol, borneol, menthol,
1,3-propanediol, heptadecyl group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid,
O3-(oleoyl)cholenic acid, dimethoxytrityl, or phenoxazine) and peptide conjugates (e.g.,
antennapedia peptide, Tat peptide), alkylating agents, phosphate, amino, mercapto, PEG
(e.g., PEG-40K), MPEG, [MPEG]2, polyamino, alkyl, substituted alkyl, radiolabeled
markers, enzymes, haptens (e.g., biotin), transport/absorption facilitators (e.g., aspirin,
vitamin E, folic acid), synthetic ribonucleases (e.g., imidazole, bisimidazole, histamine,
imidazole clusters, acridine-imidazole conjugates, Eu3+ complexes of tetraazamacrocycles).

Terminal modifications can be added for a number of reasons, including as discussed
elsewhere herein to modulate activity or to modulate resistance to degradation. Terminal
modifications useful for modulating activity include modification of the 5' end with
phosphate or phosphate analogs. E.g., in preferred embodiments iRNA agents, especially
antisense strands, are 5' phosphorylated or include a phosphoryl analog at the 5'
terminus. 5'-phosphate modifications include those which are compatible with RISC
mediated gene silencing. Suitable modifications include: 5'-monophosphate ((HO)2(O)P-O-5');
5'-diphosphate ((HO)2(O)P-O-P(HO)(O)-O-5'); 5'-triphosphate
((HO)2(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-guanosine cap (7-methylated or non-methylated)
(7m-G-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-adenosine cap (Appp), and any modified or
unmodified nucleotide cap structure (N-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5');
5'-monothiophosphate (phosphorothioate; (HO)2(S)P-O-5'); 5'-monodithiophosphate
(phosphorodithioate; (HO)(HS)(S)P-O-5'), 5'-phosphorothiolate ((HO)2(O)P-S-5'); any
additional combination of oxygen/sulfur replaced monophosphate, diphosphate and
triphosphates (e.g., 5'-alpha-thiotriphosphate, 5'-gamma-thiotriphosphate, etc.),
5'-phosphoramidates ((HO)2(O)P-NH-5', (HO)(NH2)(O)P-O-5'), 5'-alkylphosphonates
(R=alkyl=methyl, ethyl, isopropyl, propyl, etc., e.g., RP(OH)(O)-O-5'-, (OH)2(O)P-5'-CH2-),
5'-alkyletherphosphonates (R=alkylether=methoxymethyl (MeOCH2-), ethoxymethyl, etc.,
e.g., RP(OH)(O)-O-5'-).

Terminal modifications useful for increasing resistance to degradation include 
Terminal modifications can also be useful for monitoring distribution, and in such
cases the preferred groups to be added include fluorophores, e.g., fluorescein or an Alexa dye,
e.g., Alexa 488. Terminal modifications can also be useful for enhancing uptake; useful
modifications for this include cholesterol. Terminal modifications can also be useful for
cross-linking an RNA agent to another moiety; modifications useful for this include
mitomycin C.

Candidate modifications can be evaluated as described below.

The Bases

Adenine, guanine, cytosine and uracil are the most common bases found in RNA.
These bases can be modified or replaced to provide RNA's having improved properties.
E.g., nuclease resistant oligoribonucleotides can be prepared with these bases or with
synthetic and natural nucleobases (e.g., inosine, thymine, xanthine, hypoxanthine,
nubularine, isoguanisine, or tubercidine) and any one of the above modifications.

Alternatively, substituted or modified analogs of any of the above bases, e.g., "unusual
bases" and "universal bases," can be employed. Examples include without limitation
2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and
other alkyl derivatives of adenine and guanine, 5-halouracil and cytosine, 5-propynyl uracil
and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil,
5-halouracil, 5-(2-aminopropyl)uracil, 5-amino allyl uracil, 8-halo, amino, thiol, thioalkyl,
hydroxyl and other 8-substituted adenines and guanines, 5-trifluoromethyl and other
5-substituted uracils and cytosines, 7-methylguanine, 5-substituted pyrimidines,
6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including 2-aminopropyladenine,
5-propynyluracil and 5-propynylcytosine, dihydrouracil, 3-deaza-5-azacytosine,
2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine, 7-deazaadenine,
N6,N6-dimethyladenine, 2,6-diaminopurine, 5-amino-allyl-uracil, N3-methyluracil, substituted
1,2,4-triazoles, 2-pyridinone, 5-nitroindole, 3-nitropyrrole, 5-methoxyuracil,
uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil, 5-methyl-2-thiouracil,
5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl cytosine,
2-thiocytosine, N6-methyladenine, N6-isopentyladenine, 2-methylthio-N6-isopentenyladenine,
N-methylguanines, or O-alkylated bases. Further purines and pyrimidines include those
disclosed in U.S. Pat. No. 3,687,808, those disclosed in the Concise Encyclopedia Of
Polymer Science And Engineering, pages 858-859, Kroschwitz, J. I., ed., John Wiley & Sons,
1990, and those disclosed by Englisch et al., Angewandte Chemie, International Edition,
1991, 30, 613.

Generally, base changes are less preferred for promoting stability, but they can be
useful for other reasons, e.g., some, e.g., 2,6-diaminopurine and 2-aminopurine, are
fluorescent. Modified bases can reduce target specificity. This should be taken into
consideration in the design of iRNA agents.

Candidate modifications can be evaluated as described below. 

Evaluation of Candidate RNA's 

One can evaluate a candidate RNA agent, e.g., a modified RNA, for a selected
property by exposing the agent or modified molecule and a control molecule to the
appropriate conditions and evaluating for the presence of the selected property. For example,
resistance to a degradant can be evaluated as follows. A candidate modified RNA (and
preferably a control molecule, usually the unmodified form) can be exposed to degradative
conditions, e.g., exposed to a milieu that includes a degradative agent, e.g., a nuclease.
E.g., one can use a biological sample, e.g., one that is similar to a milieu that might be
encountered in therapeutic use, e.g., blood or a cellular fraction, e.g., a cell-free homogenate
or disrupted cells. The candidate and control could then be evaluated for resistance to
degradation by any of a number of approaches. For example, the candidate and control could
be labeled, preferably prior to exposure, with, e.g., a radioactive or enzymatic label, or a
fluorescent label, such as Cy3 or Cy5. Control and modified RNA's can be incubated with
the degradative agent, and optionally a control, e.g., an inactivated, e.g., heat inactivated,
degradative agent. A physical parameter, e.g., size, of the modified and control molecules
is then determined. It can be determined by a physical method, e.g., by polyacrylamide
gel electrophoresis or a sizing column, to assess whether the molecule has maintained its
original length, or assessed functionally. Alternatively, Northern blot analysis can be used to
assay the length of an unlabeled modified molecule.
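
By way of illustration only, the comparison of a candidate with its control in a degradation assay of this kind could be reduced to numbers as sketched below. The sketch assumes that each gel lane has already been quantified into a full-length band signal and a total lane signal; the function names, the dictionary layout and the example values are hypothetical and are not taken from this disclosure.

```python
def fraction_intact(full_length_signal, total_signal):
    """Fraction of labeled RNA still migrating at its original length."""
    if total_signal <= 0:
        raise ValueError("total lane signal must be positive")
    if not 0 <= full_length_signal <= total_signal:
        raise ValueError("full-length signal must lie between 0 and the total")
    return full_length_signal / total_signal


def compare_to_control(candidate_lanes, control_lanes):
    """Pair candidate and control fractions at each shared time point.

    Each argument maps incubation time (hypothetical units, e.g. minutes)
    to a (full_length_signal, total_signal) tuple for that lane.
    """
    shared_times = sorted(set(candidate_lanes) & set(control_lanes))
    return {
        t: (fraction_intact(*candidate_lanes[t]),
            fraction_intact(*control_lanes[t]))
        for t in shared_times
    }


if __name__ == "__main__":
    # Made-up densitometry values, for illustration only.
    modified = {0: (100.0, 100.0), 60: (80.0, 100.0)}
    unmodified = {0: (100.0, 100.0), 60: (20.0, 100.0)}
    for t, (cand, ctrl) in compare_to_control(modified, unmodified).items():
        print(f"t={t}: candidate {cand:.2f}, control {ctrl:.2f}")
```

A candidate whose intact fraction stays above that of the unmodified control over the incubation would, under these assumptions, be scored as more resistant to the degradative agent.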

A functional assay can also be used to evaluate the candidate agent. A functional
assay can be applied initially or after an earlier non-functional assay (e.g., an assay for
resistance to degradation) to determine if the modification alters the ability of the molecule to
silence gene expression. For example, a cell, e.g., a mammalian cell, such as a mouse or
human cell, can be co-transfected with a plasmid expressing a fluorescent protein, e.g., GFP,
and a candidate RNA agent homologous to the transcript encoding the fluorescent protein
(see, e.g., WO 00/44914). For example, a modified dsRNA homologous to the GFP mRNA
can be assayed for the ability to inhibit GFP expression by monitoring for a decrease in cell
fluorescence, as compared to a control cell in which the transfection did not include the
candidate dsRNA, e.g., controls with no agent added and/or controls with a non-modified
RNA added. Efficacy of the candidate agent on gene expression can be assessed by
comparing cell fluorescence in the presence of the modified and unmodified dsRNA agents.
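
As a rough sketch of how such a fluorescence comparison might be expressed, the following computes a percent reduction in fluorescence relative to a no-agent control, after subtracting a background reading. The background subtraction, the function names and the example readings are assumptions made for illustration; the disclosure itself does not prescribe a particular calculation.

```python
def percent_knockdown(treated, no_agent_control, background=0.0):
    """Percent decrease in fluorescence relative to the no-agent control."""
    control_signal = no_agent_control - background
    if control_signal <= 0:
        raise ValueError("control fluorescence must exceed background")
    # Clamp at zero so a reading below background is not reported as >100%.
    treated_signal = max(treated - background, 0.0)
    return 100.0 * (1.0 - treated_signal / control_signal)


def compare_agents(modified, unmodified, no_agent_control, background=0.0):
    """Return knockdown for the modified and the unmodified dsRNA agent."""
    return {
        "modified": percent_knockdown(modified, no_agent_control, background),
        "unmodified": percent_knockdown(unmodified, no_agent_control, background),
    }


if __name__ == "__main__":
    # Hypothetical plate-reader values in arbitrary units.
    result = compare_agents(modified=30.0, unmodified=35.0,
                            no_agent_control=110.0, background=10.0)
    for name, value in result.items():
        print(f"{name}: {value:.1f}% knockdown")
```

Similar knockdown values for the modified and unmodified agents would suggest, under these assumptions, that the modification has not impaired silencing activity.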

In an alternative functional assay, a candidate dsRNA agent homologous to an
endogenous mouse gene, preferably a maternally expressed gene, such as c-mos, can be
injected into an immature mouse oocyte to assess the ability of the agent to inhibit gene
expression in vivo (see, e.g., WO 01/36646). A phenotype of the oocyte, e.g., the ability to
maintain arrest in metaphase II, can be monitored as an indicator that the agent is inhibiting
expression. For example, cleavage of c-mos mRNA by a dsRNA agent would cause the
oocyte to exit metaphase arrest and initiate parthenogenetic development (Colledge et al.,
Nature 370: 65-68, 1994; Hashimoto et al., Nature 370: 68-71, 1994). The effect of the
modified agent on target RNA levels can be verified by Northern blot to assay for a decrease
in the level of target mRNA, or by Western blot to assay for a decrease in the level of target
protein, as compared to a negative control. Controls can include cells in which no agent
is added and/or cells in which a non-modified RNA is added.

References

General References 

The oligoribonucleotides and oligoribonucleosides used in accordance with this
invention may be made with solid phase synthesis, see for example "Oligonucleotide synthesis, a
practical approach", Ed. M. J. Gait, IRL Press, 1984; "Oligonucleotides and Analogues, A
Practical Approach", Ed. F. Eckstein, IRL Press, 1991 (especially Chapter 1, Modern
machine-aided methods of oligodeoxyribonucleotide synthesis, Chapter 2,
Oligoribonucleotide synthesis, Chapter 3, 2'-O-Methyloligoribonucleotides: synthesis and
applications, Chapter 4, Phosphorothioate oligonucleotides, Chapter 5, Synthesis of
oligonucleotide phosphorodithioates, Chapter 6, Synthesis of oligo-2'-deoxyribonucleoside
methylphosphonates, and Chapter 7, Oligodeoxynucleotides containing modified bases).
Other particularly useful synthetic procedures, reagents, blocking groups and reaction
conditions are described in Martin, P., Helv. Chim. Acta, 1995, 75, 486-504; Beaucage, S. L.
and Iyer, R. P., Tetrahedron, 1992, 48, 2223-2311 and Beaucage, S. L. and Iyer, R. P.,
Tetrahedron, 1993, 49, 6123-6194, or references referred to therein.

Modifications described in WO 00/44895, WO 01/75164, or WO 02/44321 can be used
herein.

The disclosures of all publications, patents, and published patent applications listed
herein are hereby incorporated by reference.

Phosphate Group References

The preparation of phosphinate oligoribonucleotides is described in U.S. Pat. No.
5,508,270. The preparation of alkyl phosphonate oligoribonucleotides is described in U.S.
Pat. No. 4,469,863. The preparation of phosphoramidite oligoribonucleotides is described in
U.S. Pat. No. 5,256,775 or U.S. Pat. No. 5,366,878. The preparation of phosphotriester
oligoribonucleotides is described in U.S. Pat. No. 5,023,243. The preparation of borano
phosphate oligoribonucleotides is described in U.S. Pat. Nos. 5,130,302 and 5,177,198. The
preparation of 3'-Deoxy-3'-amino phosphoramidate oligoribonucleotides is described in U.S.
Pat. No. 5,476,925. 3'-Deoxy-3'-methylenephosphonate oligoribonucleotides are described in
An, H. et al., J. Org. Chem. 2001, 66, 2789-2801. Preparation of sulfur bridged nucleotides is
described in Sproat et al., Nucleosides Nucleotides 1988, 7, 651 and Crosstick et al.,
Tetrahedron Lett. 1989, 30, 4693.

Sugar Group References 

Modifications at the 2' position can be found in Verma, S. et al., Annu. Rev.
Biochem. 1998, 67, 99-134 and all references therein. Specific modifications to the ribose
can be found in the following references: 2'-fluoro (Kawasaki et al., J. Med. Chem., 1993,
36, 831-841), 2'-MOE (Martin, P. Helv. Chim. Acta 1996, 79, 1930-1938), "LNA" (Wengel,
J. Acc. Chem. Res. 1999, 32, 301-310).

Replacement of the Phosphate Group References 

Methylenemethylimino linked oligoribonucleosides, also identified herein as MMI
linked oligoribonucleosides, methylenedimethylhydrazo linked oligoribonucleosides, also
identified herein as MDH linked oligoribonucleosides, and methylenecarbonylamino linked
oligonucleosides, also identified herein as amide-3 linked oligoribonucleosides, and
methyleneaminocarbonyl linked oligonucleosides, also identified herein as amide-4 linked
oligoribonucleosides as well as mixed backbone compounds having, as for instance,
alternating MMI and PO or PS linkages can be prepared as is described in U.S. Pat. Nos.
5,378,825, 5,386,023, 5,489,677 and in published PCT applications PCT/US92/04294 and
PCT/US92/04305 (published as WO 92/20822 and WO 92/20823, respectively). Formacetal
and thioformacetal linked oligoribonucleosides can be prepared as is described in U.S. Pat.
Nos. 5,264,562 and 5,264,564. Ethylene oxide linked oligoribonucleosides can be prepared
as is described in U.S. Pat. No. 5,223,618. Siloxane replacements are described in
Cormier, J.F. et al. Nucleic Acids Res. 1988, 16, 4583. Carbonate replacements are described
in Tittensor, J.R. J. Chem. Soc. C 1971, 1933. Carboxymethyl replacements are described in
Edge, M.D. et al. J. Chem. Soc. Perkin Trans. 1 1972, 1991. Carbamate replacements are
described in Stirchak, E.P. Nucleic Acids Res. 1989, 17, 6129.

Replacement of the Phosphate-Ribose Backbone References 

Cyclobutyl sugar surrogate compounds can be prepared as is described in U.S. Pat.
No. 5,359,044. Pyrrolidine sugar surrogates can be prepared as is described in U.S. Pat. No.
5,519,134. Morpholino sugar surrogates can be prepared as is described in U.S. Pat. Nos.
5,142,047 and 5,235,033, and other related patent disclosures. Peptide Nucleic Acids (PNAs)
are known per se and can be prepared in accordance with any of the various procedures
referred to in Peptide Nucleic Acids (PNA): Synthesis, Properties and Potential Applications,
Bioorganic & Medicinal Chemistry, 1996, 4, 5-23. They may also be prepared in accordance
with U.S. Pat. No. 5,539,083.

Terminal Modification References 

Terminal modifications are described in Manoharan, M. et al. Antisense and Nucleic 
Acid Drug Development 12, 103-128 (2002) and references therein. 

Bases References 

N-2 substituted purine nucleoside amidites can be prepared as is described in U.S. Pat.
No. 5,459,255. 3-Deaza purine nucleoside amidites can be prepared as is described in U.S.
Pat. No. 5,457,191. 5,6-Substituted pyrimidine nucleoside amidites can be prepared as is
described in U.S. Pat. No. 5,614,617. 5-Propynyl pyrimidine nucleoside amidites can be
prepared as is described in U.S. Pat. No. 5,484,908. Additional references are disclosed
in the above section on base modifications.

Preferred iRNA Agents



Preferred RNA agents have the following structure (see Formula 2 below): 




FORMULA 2 



Referring to Formula 2 above, R1, R2, and R3 are each, independently, H (i.e., abasic
nucleotides), adenine, guanine, cytosine and uracil, inosine, thymine, xanthine,
hypoxanthine, nubularine, tubercidine, isoguanisine, 2-aminoadenine, 6-methyl and other
alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and
guanine, 5-halouracil and cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and
thymine, 5-uracil (pseudouracil), 4-thiouracil, 5-halouracil, 5-(2-aminopropyl)uracil, 5-amino
allyl uracil, 8-halo, amino, thiol, thioalkyl, hydroxyl and other 8-substituted adenines and
guanines, 5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine,
5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines,
including 2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine, dihydrouracil,
3-deaza-5-azacytosine, 2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine,
7-deazaadenine, 7-deazaguanine, N6,N6-dimethyladenine, 2,6-diaminopurine,
5-amino-allyl-uracil, N3-methyluracil, substituted 1,2,4-triazoles, 2-pyridinone, 5-nitroindole,
3-nitropyrrole, 5-methoxyuracil, uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil,
5-methyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl
cytosine, 2-thiocytosine, N6-methyladenine, N6-isopentyladenine, 2-methylthio-N6-
isopentenyladenine, N-methylguanines, or O-alkylated bases.

R4, R5, and R6 are each, independently, OR8, O(CH2CH2O)mCH2CH2OR8;
O(CH2)nR9; O(CH2)nOR9, H; halo; NH2; NHR8; N(R8)2; NH(CH2CH2NH)mCH2CH2NHR9;
NHC(O)R8; cyano; mercapto, SR8; alkyl-thio-alkyl; alkyl, aralkyl, cycloalkyl, aryl,
heteroaryl, alkenyl, alkynyl, each of which may be optionally substituted with halo, hydroxy,
oxo, nitro, haloalkyl, alkyl, alkaryl, aryl, aralkyl, alkoxy, aryloxy, amino, alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, diheteroaryl amino,
acylamino, alkylcarbamoyl, arylcarbamoyl, aminoalkyl, alkoxycarbonyl, carboxy,
hydroxyalkyl, alkanesulfonyl, alkanesulfonamido, arenesulfonamido, aralkylsulfonamido,
alkylcarbonyl, acyloxy, cyano, or ureido; or R4, R5, or R6 together combine with R7 to form
an [-O-CH2-] covalently bound bridge between the sugar 2' and 4' carbons.

A1 is: [one of three phosphorus-containing structures drawn with X1, Y1 and Z1; the
structure drawings are not recoverable from the source text]; H; OH; OCH3; W1; an abasic
nucleotide; or absent;

(a preferred A1, especially with regard to anti-sense strands, is chosen from
5'-monophosphate ((HO)2(O)P-O-5'), 5'-diphosphate ((HO)2(O)P-O-P(HO)(O)-O-5'),
5'-triphosphate ((HO)2(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'), 5'-guanosine cap (7-methylated or
non-methylated) (7m-G-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'), 5'-adenosine cap
(Appp), and any modified or unmodified nucleotide cap structure
(N-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'), 5'-monothiophosphate (phosphorothioate;
(HO)2(S)P-O-5'), 5'-monodithiophosphate (phosphorodithioate; (HO)(HS)(S)P-O-5'),
5'-phosphorothiolate ((HO)2(O)P-S-5'); any additional combination of oxygen/sulfur replaced
monophosphate, diphosphate and triphosphates (e.g., 5'-alpha-thiotriphosphate,
5'-gamma-thiotriphosphate, etc.), 5'-phosphoramidates ((HO)2(O)P-NH-5',
(HO)(NH2)(O)P-O-5'), 5'-alkylphosphonates (R=alkyl=methyl, ethyl, isopropyl, propyl, etc.,
e.g., RP(OH)(O)-O-5'-, (OH)2(O)P-5'-CH2-), 5'-alkyletherphosphonates
(R=alkylether=methoxymethyl (MeOCH2-), ethoxymethyl, etc., e.g., RP(OH)(O)-O-5'-)).

A2 is: [phosphorus-containing structure drawn with X2, Y2 and Z2; structure drawing
not recoverable from the source text];

A3 is: [phosphorus-containing structure drawn with X3, Y3 and Z3; structure drawing
not recoverable from the source text]; and

A4 is: [one of three phosphorus-containing structures drawn with X4, Y4 and Z4; the
structure drawings are not recoverable from the source text]; H; Z4; an inverted nucleotide;
an abasic nucleotide; or absent.

W1 is OH, (CH2)nR10, (CH2)nNHR10, (CH2)nOR10, (CH2)nSR10; O(CH2)nR10;
O(CH2)nOR10, O(CH2)nNR10, O(CH2)nSR10; O(CH2)nSS(CH2)nOR10, O(CH2)nC(O)OR10,
NH(CH2)nR10; NH(CH2)nNR10; NH(CH2)nOR10, NH(CH2)nSR10; S(CH2)nR10, S(CH2)nNR10,
S(CH2)nOR10, S(CH2)nSR10, O(CH2CH2O)mCH2CH2OR10; O(CH2CH2O)mCH2CH2NHR10,
NH(CH2CH2NH)mCH2CH2NHR10; Q-R10, O-Q-R10, N-Q-R10, S-Q-R10 or -O-. W4 is O, CH2,
NH, or S.

X1, X2, X3, and X4 are each, independently, O or S.

Y1, Y2, Y3, and Y4 are each, independently, OH, O-, OR8, S, Se, BH3-, H, NHR9,
N(R9)2, alkyl, cycloalkyl, aralkyl, aryl, or heteroaryl, each of which may be optionally
substituted.

Z1, Z2, and Z3 are each independently O, CH2, NH, or S. Z4 is OH, (CH2)nR10,
(CH2)nNHR10, (CH2)nOR10, (CH2)nSR10; O(CH2)nR10; O(CH2)nOR10, O(CH2)nNR10,
O(CH2)nSR10, O(CH2)nSS(CH2)nOR10, O(CH2)nC(O)OR10; NH(CH2)nR10; NH(CH2)nNR10;
NH(CH2)nOR10, NH(CH2)nSR10; S(CH2)nR10, S(CH2)nNR10, S(CH2)nOR10, S(CH2)nSR10,
O(CH2CH2O)mCH2CH2OR10, O(CH2CH2O)mCH2CH2NHR10,
NH(CH2CH2NH)mCH2CH2NHR10; Q-R10, O-Q-R10, N-Q-R10, S-Q-R10.

x is 5-100, chosen to comply with a length for an RNA agent described herein.

R7 is H; or is together combined with R4, R5, or R6 to form an [-O-CH2-] covalently
bound bridge between the sugar 2' and 4' carbons.

R8 is alkyl, cycloalkyl, aryl, aralkyl, heterocyclyl, heteroaryl, amino acid, or sugar; R9
is NH2, alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino,
diheteroaryl amino, or amino acid; and R10 is H; fluorophore (pyrene, TAMRA, fluorescein,
Cy3 or Cy5 dyes); sulfur, silicon, boron or ester protecting group; intercalating agents (e.g.,
acridines), cross-linkers (e.g., psoralene, mitomycin C), porphyrins (TPPC4, texaphyrin,
Sapphyrin), polycyclic aromatic hydrocarbons (e.g., phenazine, dihydrophenazine), artificial
endonucleases (e.g., EDTA), lipophilic carriers (cholesterol, cholic acid, adamantane acetic
acid, 1-pyrene butyric acid, dihydrotestosterone, 1,3-Bis-O(hexadecyl)glycerol,
geranyloxyhexyl group, hexadecylglycerol, borneol, menthol, 1,3-propanediol, heptadecyl
group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid, O3-(oleoyl)cholenic acid,
dimethoxytrityl, or phenoxazine) and peptide conjugates (e.g., antennapedia peptide, Tat
peptide), alkylating agents, phosphate, amino, mercapto, PEG (e.g., PEG-40K), MPEG,
[MPEG]2, polyamino; alkyl, cycloalkyl, aryl, aralkyl, heteroaryl; radiolabelled markers,
enzymes, haptens (e.g., biotin), transport/absorption facilitators (e.g., aspirin, vitamin E, folic
acid), synthetic ribonucleases (e.g., imidazole, bisimidazole, histamine, imidazole clusters,
acridine-imidazole conjugates, Eu3+ complexes of tetraazamacrocycles); or an RNA agent.
m is 0-1,000,000, and n is 0-20. Q is a spacer selected from the group consisting of abasic
sugar, amide, carboxy, oxyamine, oxyimine, thioether, disulfide, thiourea, sulfonamide, or
morpholino, biotin or fluorescein reagents.

Preferred RNA agents in which the entire phosphate group has been replaced have the
following structure (see Formula 3 below):




A40 



FORMULA 3 



Referring to Formula 3, A10-A40 is L-G-L; A10 and/or A40 may be absent, in which L
is a linker, wherein one or both L may be present or absent and is selected from the group
consisting of CH2(CH2)g; N(CH2)g; O(CH2)g; S(CH2)g. G is a functional group selected from
the group consisting of siloxane, carbonate, carboxymethyl, carbamate, amide, thioether,
ethylene oxide linker, sulfonate, sulfonamide, thioformacetal, formacetal, oxime,
methyleneimino, methylenemethylimino, methylenehydrazo, methylenedimethylhydrazo and
methyleneoxymethylimino.

R10, R20, and R30 are each, independently, H (i.e., abasic nucleotides), adenine,
guanine, cytosine and uracil, inosine, thymine, xanthine, hypoxanthine, nubularine,
tubercidine, isoguanisine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine
and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 5-halouracil and
cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil
(pseudouracil), 4-thiouracil, 5-halouracil, 5-(2-aminopropyl)uracil, 5-amino allyl uracil,
8-halo, amino, thiol, thioalkyl, hydroxyl and other 8-substituted adenines and guanines,
5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine, 5-substituted
pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including
2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine, dihydrouracil,
3-deaza-5-azacytosine, 2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine,
7-deazaadenine, 7-deazaguanine, N6,N6-dimethyladenine, 2,6-diaminopurine,
5-amino-allyl-uracil, N3-methyluracil, substituted 1,2,4-triazoles, 2-pyridinone, 5-nitroindole,
3-nitropyrrole, 5-methoxyuracil, uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil,
5-methyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl cytosine,
2-thiocytosine, N6-methyladenine, N6-isopentyladenine, 2-methylthio-N6-isopentenyladenine,
N-methylguanines, or O-alkylated bases.

R^^, R^^ and R^° are each, independently, OR^; O(CH2CH2O)mCH2CH2OR^; 
O(CH2)nR^; O(CH2)nOR^; H; halo; NH2; NHR^; N(R^)2; NH(CH2CH2NH)mCH2CH2R^; 
NHC(O)R^; cyano; mercapto, SR^; alkyl-thio-alkyl; alkyl, aralkyl, cycloalkyl, aryl, 
heteroaryl, alkenyl, alkynyl, each of which may be optionally substituted with halo, hydroxy, 
oxo, nitro, haloalkyl, alkyl, alkaryl, aryl, aralkyl, alkoxy, aryloxy, amino, alkylamino, 
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, diheteroaryl amino, 
acylamino, alkylcarbamoyl, arylcarbamoyl, aminoalkyl, alkoxycarbonyl, carboxy, 
hydroxyalkyl, alkanesulfonyl, alkanesulfonamido, arenesulfonamido, aralkylsulfonamido, 
alkylcarbonyl, acyloxy, cyano, and ureido groups; or R^^, R^^, or R^° together combine with 
R^^ to form an [-O-CH2-] covalently bound bridge between the sugar 2' and 4' carbons, 
x is 5-100 or chosen to comply with a length for an RNA agent described herein. 
R^^ is H; or is together combined with R^^, R^^ or R^^ to form an [-O-CH2-] 
covalently bound bridge between the sugar 2' and 4' carbons. 

R is alkyl, cycloalkyl, aryl, aralkyl, heterocyclyl, heteroaryl, amino acid, or sugar; 
and is NH2, alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl 
amino, diheteroaryl amino, or amino acid, m is 0-1,000,000, n is 0-20, and g is 0-2. 

Preferred nucleoside surrogates have the following structure (see Formula 4 below): 

SLR*^^-(M-SLR^%)x-M-SLR^°^ 

FORMULA 4 

S is a nucleoside surrogate selected from the group consisting of morpholino, 
cyclobutyl, pyrrolidine and peptide nucleic acid. L is a linker and is selected from the group 
consisting of CH2(CH2)g; N(CH2)g; O(CH2)g; S(CH2)g; -C(O)(CH2)n-; or may be absent. M is 
an amide bond; sulfonamide; sulfinate; phosphate group; modified phosphate group as 
described herein; or may be absent. 

are each, independently, H (i.e., abasic nucleotides), adenine, 
guanine, cytosine and uracil, inosine, thymine, xanthine, hypoxanthine, nubularine, 
tubercidine, isoguanisine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine 
and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 5-halouracil and 
cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil 
(pseudouracil), 4-thiouracil, 5-halouracil, 5-(2-aminopropyl)uracil, 5-amino allyl uracil, 
8-halo, amino, thiol, thioalkyl, hydroxyl and other 8-substituted adenines and guanines, 
5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine, 5-substituted 
pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including 
2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine, dihydrouracil, 
3-deaza-5-azacytosine, 2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine, 
7-deazaadenine, 7-deazaguanine, N6,N6-dimethyladenine, 2,6-diaminopurine, 5-amino-allyl-uracil, 
N3-methyluracil, substituted 1,2,4-triazoles, 2-pyridinones, 5-nitroindole, 3-nitropyrrole, 
5-methoxyuracil, uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil, 
5-methyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil, 
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl cytosine, 
2-thiocytosine, N6-methyladenine, N6-isopentyladenine, 2-methylthio-N6-isopentenyladenine, 
N-methylguanines, or O-alkylated bases. 

X is 5-100, or chosen to comply with a length for an RNA agent described herein; and 
g is 0-2. 

Nuclease resistant monomers 
In one aspect, the invention features a nuclease resistant monomer, or an iRNA 
agent which incorporates a nuclease resistant monomer (NRM), such as those described 
herein and those described in copending, co-owned United States Provisional Application 
Serial No. 60/469,612 (Attorney Docket No. 14174-069P01), filed on May 9, 2003, which is 
hereby incorporated by reference. 

In addition, the invention includes iRNA agents having an NRM and another element 
described herein. E.g., the invention includes an iRNA agent described herein, e.g., a 
palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent 
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having 
an architecture or structure described herein, an iRNA associated with an amphipathic 
delivery agent described herein, an iRNA associated with a drug delivery module described 
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as 
described herein, which also incorporates an NRM. 

An iRNA agent can include monomers which have been modified so as to inhibit 
degradation, e.g., by nucleases, e.g., endonucleases or exonucleases, found in the body of a 
subject. These monomers are referred to herein as NRM's, or nuclease resistance promoting 
monomers or modifications. In many cases these modifications will modulate other 
properties of the iRNA agent as well, e.g., the ability to interact with a protein, e.g., a 
transport protein, e.g., serum albumin, or a member of the RISC (RNA-induced Silencing 
Complex), or the ability of the first and second sequences to form a duplex with one another 
or to form a duplex with another sequence, e.g., a target molecule. 

While not wishing to be bound by theory, it is believed that modifications of the 
sugar, base, and/or phosphate backbone in an iRNA agent can enhance endonuclease and 
exonuclease resistance, and can enhance interactions with transporter proteins and one or 
more of the functional components of the RISC complex. Preferred modifications are those 
that increase exonuclease and endonuclease resistance and thus prolong the half-life of the 
iRNA agent prior to interaction with the RISC complex, but at the same time do not render 
the iRNA agent resistant to endonuclease activity in the RISC complex. Again, while not 
wishing to be bound by any theory, it is believed that placement of the modifications at or 
near the 3' and/or 5' end of antisense strands can result in iRNA agents that meet the 
preferred nuclease resistance criteria delineated above. Again, still while not wishing to be 
bound by any theory, it is believed that placement of the modifications at e.g., the middle of a 
sense strand can result in iRNA agents that are relatively less likely to undergo off-targeting. 

Modifications described herein can be incorporated into any double-stranded RNA and 
RNA-like molecule described herein, e.g., an iRNA agent. An iRNA agent may include a 
duplex comprising a hybridized sense and antisense strand, in which the antisense strand 
and/or the sense strand may include one or more of the modifications described herein. The 
antisense strand may include modifications at the 3' end and/or the 5' end and/or at one or 
more positions that occur 1-6 (e.g., 1-5, 1-4, 1-3, 1-2) nucleotides from either end of the 
strand. The sense strand may include modifications at the 3' end and/or the 5' end and/or at 
any one of the intervening positions between the two ends of the strand. The iRNA agent 
may also include a duplex comprising two hybridized antisense strands. The first and/or the 
second antisense strand may include one or more of the modifications described herein. 
Thus, one and/or both antisense strands may include modifications at the 3' end and/or the 5' 
end and/or at one or more positions that occur 1-6 (e.g., 1-5, 1-4, 1-3, 1-2) nucleotides from 
either end of the strand. Particular configurations are discussed below. 

Modifications that can be useful for producing iRNA agents that meet the preferred 
nuclease resistance criteria delineated above can include one or more of the following 
chemical and/or stereochemical modifications of the sugar, base, and/or phosphate backbone: 

(i) chiral (Sp) thioates. Thus, preferred NRM's include nucleotide dimers 
enriched or pure for a particular chiral form of a modified phosphate group containing a 
heteroatom at the nonbridging position, e.g., Sp or Rp, at the position X, where this is the 
position normally occupied by the oxygen. The atom at X can also be S, Se, NR2, or BR3. 
When X is S, an enriched or chirally pure Sp linkage is preferred. Enriched means at least 70, 
80, 90, 95, or 99% of the preferred form. Such NRM's are discussed in more detail below; 

(ii) attachment of one or more cationic groups to the sugar, base, and/or the 
phosphorus atom of a phosphate or modified phosphate backbone moiety. Thus, preferred 
NRM's include monomers at the terminal position derivatized with a cationic group. As the 5' 
end of an antisense sequence should have a terminal -OH or phosphate group, this NRM is 
preferably not used at the 5' end of an antisense sequence. The group should be attached at a 
position on the base which minimizes interference with H bond formation and hybridization, 
e.g., away from the face which interacts with the complementary base on the other strand, 
e.g., at the 5-position of a pyrimidine or the 7-position of a purine. These are discussed in 
more detail below; 

(iii) nonphosphate linkages at the termini. Thus, preferred NRM's include 
non-phosphate linkages, e.g., a linkage of 4 atoms which confers greater resistance to cleavage 
than does a phosphate bond. Examples include 3' CH2-NCH3-O-CH2-5' and 
3' CH2-NH-(O=)-CH2-5'; 

(iv) 3'-bridging thiophosphates and 5'-bridging thiophosphates. Thus, preferred 
NRM's can include these structures; 

(v) L-RNA, 2'-5' linkages, inverted linkages, α-nucleosides. Thus, other preferred 
NRM's include: L nucleosides and dimeric nucleotides derived from L-nucleosides; 2'-5' 
phosphate, non-phosphate and modified phosphate linkages (e.g., thiophosphates, 
phosphoramidates and boronophosphates); dimers having inverted linkages, e.g., 3'-3' or 
5'-5' linkages; monomers having an alpha linkage at the 1' site on the sugar, e.g., the structures 
described herein having an alpha linkage; 

(vi) conjugate groups. Thus, preferred NRM's can include e.g., a targeting moiety or 
a conjugated ligand described herein conjugated with the monomer, e.g., through the sugar, 
base, or backbone; 

(vii) abasic linkages. Thus, preferred NRM's can include an abasic monomer, e.g., an 
abasic monomer as described herein (e.g., a nucleobaseless monomer); an aromatic or 
heterocyclic or polyheterocyclic aromatic monomer as described herein; and 

(viii) 5'-phosphonates and 5'-phosphate prodrugs. Thus, preferred NRM's include 
monomers, preferably at the terminal position, e.g., the 5' position, in which one or more 
atoms of the phosphate group is derivatized with a protecting group, which protecting group 
or groups are removed as a result of the action of a component in the subject's body, e.g., a 
carboxyesterase or an enzyme present in the subject's body. E.g., a phosphate prodrug in 
which a carboxy esterase cleaves the protected molecule, resulting in the production of a 
thioate anion which attacks a carbon adjacent to the O of a phosphate, resulting in the 
production of an unprotected phosphate. 

One or more different NRM modifications can be introduced into an iRNA agent or 
into a sequence of an iRNA agent. An NRM modification can be used more than once in a 
sequence or in an iRNA agent. As some NRM's interfere with hybridization, the total 
number incorporated should be such that acceptable levels of iRNA agent duplex formation 
are maintained. 

In some embodiments NRM modifications are introduced into the terminal regions, the 
cleavage site or the cleavage region of a sequence (a sense strand or sequence) which does 
not target a desired sequence or gene in the subject. This can reduce off-target silencing. 

Chiral Sp Thioates 

A modification can include the alteration, e.g., replacement, of one or both of the 
non-linking (X and Y) phosphate oxygens and/or of one or more of the linking (W and Z) 
phosphate oxygens. Formula X below depicts a phosphate moiety linking two sugar/sugar 
surrogate-base moieties, SB1 and SB2. 

[Formula X: structure not reproduced in this text; a phosphate moiety with non-linking 
positions X and Y and linking positions W and Z, joining SB1 and SB2] 

FORMULA X 

In certain embodiments, one of the non-linking phosphate oxygens in the phosphate 
backbone moiety (X and Y) can be replaced by any one of the following: S, Se, BR3 (R is 
hydrogen, alkyl, aryl, etc.), C (i.e., an alkyl group, an aryl group, etc.), H, NR2 (R is 
hydrogen, alkyl, aryl, etc.), or OR (R is alkyl or aryl). The phosphorus atom in an 
unmodified phosphate group is achiral. However, replacement of one of the non-linking 
oxygens with one of the above atoms or groups of atoms renders the phosphorus atom chiral; 
in other words a phosphorus atom in a phosphate group modified in this way is a stereogenic 
center. The stereogenic phosphorus atom can possess either the "R" configuration (herein 
Rp) or the "S" configuration (herein Sp). Thus if 60% of a population of stereogenic 
phosphorus atoms have the Rp configuration, then the remaining 40% of the population of 
stereogenic phosphorus atoms have the Sp configuration. 
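
By way of illustration only, the complementary relationship between the Rp and Sp fractions can be sketched in Python. The function names and the list-of-labels input are assumptions made for this example and are not part of the disclosure; the levels 70, 80, 90, 95 and 99% are the enrichment levels recited above for the preferred form.

```python
# Illustrative sketch only: names and input format are hypothetical.
ENRICHMENT_LEVELS = (70, 80, 90, 95, 99)  # percent levels recited in the text


def configuration_percentages(labels):
    """Return (percent Rp, percent Sp) for a list of 'Rp'/'Sp' labels."""
    labels = list(labels)
    if not labels:
        raise ValueError("at least one stereogenic phosphorus atom is required")
    unknown = set(labels) - {"Rp", "Sp"}
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    sp = 100.0 * labels.count("Sp") / len(labels)
    # Each stereogenic phosphorus is either Rp or Sp, so the two sum to 100%.
    return 100.0 - sp, sp


def highest_level_met(percent):
    """Return the highest recited enrichment level reached, or None."""
    met = [level for level in ENRICHMENT_LEVELS if percent >= level]
    return max(met) if met else None


population = ["Rp"] * 60 + ["Sp"] * 40
rp, sp = configuration_percentages(population)
print(rp, sp)                 # 60.0 40.0
print(highest_level_met(sp))  # None
```

In this sketch a population that is 60% Rp is reported as 40% Sp and meets none of the recited enrichment levels for the Sp form.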

In some embodiments, iRNA agents having phosphate groups in which a phosphate 
non-linking oxygen has been replaced by another atom or group of atoms may contain a 
population of stereogenic phosphorus atoms in which at least about 50% of these atoms (e.g., 
at least about 60% of these atoms, at least about 70% of these atoms, at least about 80% of 
these atoms, at least about 90% of these atoms, at least about 95% of these atoms, at least 
about 98% of these atoms, at least about 99% of these atoms) have the Sp configuration. 
Alternatively, iRNA agents having phosphate groups in which a phosphate non-linking 
oxygen has been replaced by another atom or group of atoms may contain a population of 
stereogenic phosphorus atoms in which at least about 50% of these atoms (e.g., at least about 
60% of these atoms, at least about 70% of these atoms, at least about 80% of these atoms, at 
least about 90% of these atoms, at least about 95% of these atoms, at least about 98% of 
these atoms, at least about 99% of these atoms) have the Rp configuration. In other 
embodiments, the population of stereogenic phosphorus atoms may have the Sp 
configuration and may be substantially free of stereogenic phosphorus atoms having the Rp 
configuration. In still other embodiments, the population of stereogenic phosphorus atoms 
may have the Rp configuration and may be substantially free of stereogenic phosphorus 
atoms having the Sp configuration. As used herein, the phrase "substantially free of 
stereogenic phosphorus atoms having the Rp configuration" means that moieties containing 
stereogenic phosphorus atoms having the Rp configuration cannot be detected by 
conventional methods known in the art (chiral HPLC, NMR analysis using chiral shift 
reagents, etc.). As used herein, the phrase "substantially free of stereogenic phosphorus 
atoms having the Sp configuration" means that moieties containing stereogenic phosphorus 
atoms having the Sp configuration cannot be detected by conventional methods known in the 
art (chiral HPLC, NMR analysis using chiral shift reagents, etc.). 

In a preferred embodiment, modified iRNA agents contain a phosphorothioate group, 
i.e., a phosphate group in which a phosphate non-linking oxygen has been replaced by a 
sulfur atom. In an especially preferred embodiment, the population of phosphorothioate 
stereogenic phosphorus atoms may have the Sp configuration and be substantially free of 
stereogenic phosphorus atoms having the Rp configuration. 

Phosphorothioates may be incorporated into iRNA agents using dimers, e.g., formulas 
X-1 and X-2. 

[Formulas X-1 and X-2: structures not reproduced in this text; surviving labels: DMTO, 
BASE, solid phase reagent, N(iPr)2] 

The former can be used to introduce phosphorothioate 
at the 3' end of a strand, while the latter can be used to introduce this modification at the 5' 
end or at a position that occurs e.g., 1, 2, 3, 4, 5, or 6 nucleotides from either end of the 
strand. In the above formulas, Y can be 2-cyanoethoxy, W and Z can be O, R2' can be, e.g., a 
substituent that can impart the C-3 endo configuration to the sugar (e.g., OH, F, OCH3), 
DMT is dimethoxytrityl, and "BASE" can be a natural, unusual, or a universal base. 

X-1 and X-2 can be prepared using chiral reagents or directing groups that can result 
in phosphorothioate-containing dimers having a population of stereogenic phosphorus atoms 
having essentially only the Rp configuration (i.e., being substantially free of the Sp 
configuration) or only the Sp configuration (i.e., being substantially free of the Rp 
configuration). Alternatively, dimers can be prepared having a population of stereogenic 
phosphorus atoms in which about 50% of the atoms have the Rp configuration and about 
50% of the atoms have the Sp configuration. Dimers having stereogenic phosphorus atoms 
with the Rp configuration can be identified and separated from dimers having stereogenic 
phosphorus atoms with the Sp configuration using e.g., enzymatic degradation and/or 
conventional chromatography techniques. 

Cationic Groups 

Modifications can also include attachment of one or more cationic groups to the 
sugar, base, and/or the phosphorus atom of a phosphate or modified phosphate backbone 
moiety. A cationic group can be attached to any atom capable of substitution on a natural, 
unusual or universal base. A preferred position is one that does not interfere with 
hybridization, i.e., does not interfere with the hydrogen bonding interactions needed for base 
pairing. A cationic group can be attached e.g., through the C2' position of a sugar or 
analogous position in a cyclic or acyclic sugar surrogate. Cationic groups can include e.g., 
protonated amino groups, derived from e.g., O-AMINE (AMINE = NH2; alkylamino, 
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or diheteroaryl 
amino, ethylene diamine, polyamino); aminoalkoxy, e.g., O(CH2)nAMINE (e.g., AMINE = 
NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or 
diheteroaryl amino, ethylene diamine, polyamino); amino (e.g., NH2; alkylamino, 
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, diheteroaryl amino, 
or amino acid); or NH(CH2CH2NH)nCH2CH2-AMINE (AMINE = NH2; alkylamino, 
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or diheteroaryl 
amino). 

Nonphosphate Linkages 

Modifications can also include the incorporation of nonphosphate linkages at the 5' 
and/or 3' end of a strand. Examples of nonphosphate linkages which can replace the 
phosphate group include methyl phosphonate, hydroxylamino, siloxane, carbonate, 
carboxymethyl, carbamate, amide, thioether, ethylene oxide linker, sulfonate, sulfonamide, 
thioformacetal, formacetal, oxime, methyleneimino, methylenemethylimino, 
methylenehydrazo, methylenedimethylhydrazo and methyleneoxymethylimino. Preferred 
replacements include the methyl phosphonate and hydroxylamino groups. 

3'-bridging thiophosphates and 5'-bridging thiophosphates; locked-RNA, 2'-5' 
linkages, inverted linkages, α-nucleosides; conjugate groups; abasic linkages; and 
5'-phosphonates and 5'-phosphate prodrugs 

Referring to formula X above, modifications can include replacement of one of the 
bridging or linking phosphate oxygens in the phosphate backbone moiety (W and Z). Unlike 
the situation where only one of X or Y is altered, the phosphorus center in the 
phosphorodithioates is achiral which precludes the formation of iRNA agents containing a 
stereogenic phosphorus atom. 

Modifications can also include linking two sugars via a phosphate or modified 
phosphate group through the 2' position of a first sugar and the 5' position of a second sugar. 
Also contemplated are inverted linkages in which both a first and second sugar are each 
linked through the respective 3' positions. Modified RNA's can also include "abasic" sugars, 
which lack a nucleobase at C-1'. The sugar group can also contain one or more carbons that 
possess the opposite stereochemical configuration than that of the corresponding carbon in 
ribose. Thus, a modified iRNA agent can include nucleotides containing e.g., arabinose, as 
the sugar. In another subset of this modification, the natural, unusual, or universal base may 
have the α-configuration. Modifications can also include L-RNA. 

Modifications can also include 5'-phosphonates, e.g., P(O)(O-)2-X-C5'-sugar (X = 
CH2, CF2, CHF), and 5'-phosphate prodrugs, e.g., P(O)[OCH2CH2SC(O)R]2CH2C5'-sugar. 
In the latter case, the prodrug groups may be decomposed via reaction first with carboxy 
esterases. The remaining ethyl thiolate group via intramolecular SN2 displacement can depart 
as episulfide to afford the underivatized phosphate group. 

Modification can also include the addition of conjugating groups described elsewhere 
herein, which are preferably attached to an iRNA agent through any amino group available 
for conjugation. 

Nuclease resistant modifications include some which can be placed only at the 
terminus and others which can go at any position. Generally, the modifications can 
inhibit hybridization, so it is preferable to use them only in terminal regions, and preferable 
not to use them at the cleavage site or in the cleavage region of a sequence which targets a 
subject sequence or gene. They can be used anywhere in a sense sequence, provided that 
sufficient hybridization between the two sequences of the iRNA agent is maintained. In 
some embodiments it is desirable to put the NRM at the cleavage site or in the cleavage 
region of a sequence which does not target a subject sequence or gene, as it can minimize 
off-target silencing. 

In addition, an iRNA agent described herein can have an overhang which does not 
form a duplex structure with the other sequence of the iRNA agent (i.e., it is an overhang) but 
which does hybridize, either with itself or with another nucleic acid other than the other 
sequence of the iRNA agent. 

In most cases, the nuclease-resistance promoting modifications will be distributed 
differently depending on whether the sequence will target a sequence in the subject (often 
referred to as an anti-sense sequence) or will not target a sequence in the subject (often 
referred to as a sense sequence). If a sequence is to target a sequence in the subject, 
modifications which interfere with or inhibit endonuclease cleavage should not be inserted in 
the region which is subject to RISC mediated cleavage, e.g., the cleavage site or the cleavage 
region (As described in Elbashir et al., 2001, Genes and Dev. 15: 188, hereby incorporated 
by reference, cleavage of the target occurs about in the middle of a 20 or 21 nt guide RNA, or 
about 10 or 11 nucleotides upstream of the first nucleotide which is complementary to the 
guide sequence. As used herein cleavage site refers to the nucleotide on either side of the 
cleavage site, on the target or on the iRNA agent strand which hybridizes to it. Cleavage 
region means any nucleotide within 1, 2, or 3 nucleotides of the cleavage site, in either direction.) 
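
By way of illustration only, the definitions of cleavage site and cleavage region given above can be sketched in Python. The sketch assumes positions numbered from 1 at the 5' end of the strand and assumes that the scissile bond lies between positions 10 and 11; both the numbering convention and the function names are hypothetical choices made for this example, and the text itself says only that cleavage occurs at about the middle of a 20 or 21 nt guide.

```python
# Illustrative sketch only: numbering convention and default offset are assumptions.

def cleavage_site(left=10):
    """Return the two positions flanking the assumed scissile bond."""
    return (left, left + 1)


def cleavage_region(length, left=10, window=3):
    """Return positions within `window` nucleotides of the cleavage site.

    The site positions themselves are included. `window` may be 1, 2, or 3,
    following the definition of "cleavage region" in the text.
    """
    first, second = cleavage_site(left)
    low = max(1, first - window)
    high = min(length, second + window)
    return list(range(low, high + 1))


print(cleavage_site())                # (10, 11)
print(cleavage_region(21))            # [7, 8, 9, 10, 11, 12, 13, 14]
print(cleavage_region(21, window=1))  # [9, 10, 11, 12]
```

Under these assumptions, a modification that interferes with endonuclease cleavage would be kept out of positions 7 to 14 of a 21 nt targeting strand when the widest window is used.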

Such modifications can be introduced into the terminal regions, e.g., at the terminal 
position or within 2, 3, 4, or 5 positions of the terminus, of a sequence which targets or a 
sequence which does not target a sequence in the subject. 

An iRNA agent can have a first and a second strand chosen from the following: 
a first strand which does not target a sequence and which has an NRM modification at 
or within 1, 2, 3, 4, 5, or 6 positions from the 3' end; 

a first strand which does not target a sequence and which has an NRM modification at 
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end; 

a first strand which does not target a sequence and which has an NRM modification at 
or within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at 
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end; 

a first strand which does not target a sequence and which has an NRM modification at 
the cleavage site or in the cleavage region; 

a first strand which does not target a sequence and which has an NRM modification at 
the cleavage site or in the cleavage region and one or more of an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at or within 1, 2, 3, 
4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2, 3, 4, 5, or 6 
positions from both the 3' and the 5' end; and 

a second strand which targets a sequence and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 3' end; 

a second strand which targets a sequence and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 5' end (5' end NRM modifications are 
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5' 
terminus of an antisense strand); 

a second strand which targets a sequence and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 5' end; 

a second strand which targets a sequence and which preferably does not have an 
NRM modification at the cleavage site or in the cleavage region; 

a second strand which targets a sequence and which does not have an NRM 
modification at the cleavage site or in the cleavage region and one or more of an NRM 
modification at or within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at 
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2, 
3, 4, 5, or 6 positions from both the 3' and the 5' end (5' end NRM modifications are 
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5' 
terminus of an antisense strand). 

An iRNA agent can also target two sequences and can have a first and second strand 
chosen from: 

a first strand which targets a sequence and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 3' end; 

a first strand which targets a sequence and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 5' end (5' end NRM modifications are 
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5' 
terminus of an antisense strand); 

a first strand which targets a sequence and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 5' end; 

a first strand which targets a sequence and which preferably does not have an 
NRM modification at the cleavage site or in the cleavage region; 

a first strand which targets a sequence and which does not have an NRM modification 
at the cleavage site or in the cleavage region and one or more of an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at or within 1, 2, 3, 
4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2, 3, 4, 5, or 6 
positions from both the 3' and the 5' end (5' end NRM modifications are preferentially not at 
the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5' terminus of an 
antisense strand); and 

a second strand which targets a sequence and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 3' end; 

a second strand which targets a sequence and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 5' end (5' end NRM modifications are 
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5' 
terminus of an antisense strand); 

a second strand which targets a sequence and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at or 
within 1, 2, 3, 4, 5, or 6 positions from the 5' end; 

a second strand which targets a sequence and which preferably does not have an 
NRM modification at the cleavage site or in the cleavage region; 

a second strand which targets a sequence and which does not have an NRM 
modification at the cleavage site or in the cleavage region and one or more of an NRM 
modification at or within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at 
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2, 
3, 4, 5, or 6 positions from both the 3' and the 5' end (5' end NRM modifications are 
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5' 
terminus of an antisense strand). 
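
By way of illustration only, the placement preferences listed above for a strand which targets a sequence can be sketched in Python. The sketch assumes positions numbered from 1 at the 5' end, assumes a scissile bond between positions 10 and 11, and uses hypothetical function names; it flags positions rather than enforcing any rule, since the text states preferences and not absolute requirements.

```python
# Illustrative sketch only: numbering, default offsets and names are assumptions.

def terminal_window(length, end, max_offset=6):
    """Positions at, or within `max_offset` positions of, the given end."""
    if end == "5'":
        return set(range(1, min(length, 1 + max_offset) + 1))
    if end == "3'":
        return set(range(max(1, length - max_offset), length + 1))
    raise ValueError("end must be \"5'\" or \"3'\"")


def cleavage_region(length, left=10, window=3):
    """Positions within `window` nucleotides of the assumed cleavage site."""
    return set(range(max(1, left - window), min(length, left + 1 + window) + 1))


def check_targeting_strand(length, nrm_positions):
    """List NRM positions that go against the stated preferences."""
    notes = []
    region = cleavage_region(length)
    for pos in sorted(nrm_positions):
        if pos not in range(1, length + 1):
            raise ValueError(f"position {pos} is outside the strand")
        if pos in region:
            notes.append(f"position {pos}: in the cleavage region")
        elif pos == 1:
            # 5' end NRMs are preferentially not at the terminus itself.
            notes.append(f"position {pos}: at the 5' terminus")
    return notes


print(sorted(terminal_window(21, "3'")))    # [15, 16, 17, 18, 19, 20, 21]
print(check_targeting_strand(21, {2, 20}))  # []
print(check_targeting_strand(21, {1, 10}))
```

For a strand which does not target a sequence, the same helper functions could be used without the cleavage-region check, since the text permits, and in some embodiments prefers, an NRM at the cleavage site of such a strand.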



Ribose Mimics 

In one aspect, the invention features a ribose mimic, or an iRNA agent which 
incorporates a ribose mimic, such as those described herein and those described in copending 
co-owned United States Provisional Application Serial No. 60/454,962 (Attorney Docket No. 
14174-064P01), filed on March 13, 2003, which is hereby incorporated by reference. 

In addition, the invention includes iRNA agents having a ribose mimic and another 
element described herein. E.g., the invention includes an iRNA agent described herein, e.g., 
a palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent 
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having 
an architecture or structure described herein, an iRNA associated with an amphipathic 
delivery agent described herein, an iRNA associated with a drug delivery module described 
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as 
described herein, which also incorporates a ribose mimic. 

Thus, an aspect of the invention features an iRNA agent that includes a secondary 
hydroxyl group, which can increase efficacy and/or confer nuclease resistance to the agent. 
Nucleases, e.g., cellular nucleases, can hydrolyze nucleic acid phosphodiester bonds, 
resulting in partial or complete degradation of the nucleic acid. The secondary hydroxy 
group confers nuclease resistance to an iRNA agent by rendering the iRNA agent less prone 
to nuclease degradation relative to an iRNA which lacks the modification. While not 
wishing to be bound by theory, it is believed that the presence of a secondary hydroxyl group 
on the iRNA agent can act as a structural mimic of a 3' ribose hydroxyl group, thereby 
causing it to be less susceptible to degradation. 

The secondary hydroxyl group refers to an "OH" radical that is attached to a carbon 
atom substituted by two other carbons and a hydrogen. The secondary hydroxyl group that 
confers nuclease resistance as described above can be part of any acyclic carbon-containing 
group. The hydroxyl may also be part of any cyclic carbon-containing group, and preferably 
one or more of the following conditions is met: (1) there is no ribose moiety between the 
hydroxyl group and the terminal phosphate group or (2) the hydroxyl group is not on a sugar 
moiety which is coupled to a base. The hydroxyl group is located at least two bonds (e.g., at 
least three bonds away, at least four bonds away, at least five bonds away, at least six bonds 
away, at least seven bonds away, at least eight bonds away, at least nine bonds away, at least 
ten bonds away, etc.) from the terminal phosphate group phosphorus of the iRNA agent. In 
preferred embodiments, there are five intervening bonds between the terminal phosphate 
group phosphorus and the secondary hydroxyl group. 

Preferred iRNA agent delivery modules with five intervening bonds between the 
terminal phosphate group phosphorus and the secondary hydroxyl group have the following 
structure (see formula Y below): 

[Formula Y: structure not reproduced in this text; surviving labels: A, W, CH2, R2, R3, 
R5, R6, OR7] 

(Y) 

Referring to formula Y, A is an iRNA agent, including any iRNA agent described
herein. The iRNA agent may be connected directly or indirectly (e.g., through a spacer or
linker) to "W" of the phosphate group. These spacers or linkers can include, e.g., -(CH2)n-,
-(CH2)nN-, -(CH2)nO-, -(CH2)nS-, O(CH2CH2O)nCH2CH2OH (e.g., n = 3 or 6), abasic sugars,
amide, carboxy, amine, oxyamine, oxyimine, thioether, disulfide, thiourea, sulfonamide, or
morpholino, or biotin and fluorescein reagents.

The iRNA agents can have a terminal phosphate group that is unmodified (e.g., W,
Y, and Z are O) or modified. In a modified phosphate group, W and Z can be independently
NH, O, or S; and X and Y can be independently S, Se, BH3-, C1-C6 alkyl, C6-C10 aryl, H, O,
O-, alkoxy or amino (including alkylamino, arylamino, etc.). Preferably, W, X and Z are O
and Y is S.

R1 and R3 are each, independently, hydrogen; or C1-C100 alkyl, optionally substituted
with hydroxyl, amino, halo, phosphate or sulfate and/or may be optionally inserted with N,
O, S, alkenyl or alkynyl.

R2 is hydrogen; C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R2 may be taken together with R4 or R6 to form a ring of 5-12 atoms.

R4 is hydrogen; C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R4 may be taken together with R2 or R5 to form a ring of 5-12 atoms.

R5 is hydrogen; C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R5 may be taken together with R4 to form a ring of 5-12 atoms.

R6 is hydrogen; C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R6 may be taken together with R2 to form a ring of 6-10 atoms.

R7 is hydrogen, C1-C100 alkyl, or C(O)(CH2)qC(O)NHR9; T is hydrogen or a
functional group; n and q are each independently 1-100; R8 is C1-C10 alkyl or C6-C10 aryl;
and R9 is hydrogen, C1-C10 alkyl, C6-C10 aryl or a solid support agent.

Preferred embodiments may include one or more of the following subsets of iRNA
agent delivery modules.

In one subset of RNAi agent delivery modules, A can be connected directly or 
indirectly through a terminal 3' or 5' ribose sugar carbon of the RNA agent. 

In another subset of RNAi agent delivery modules, X, W, and Z are O and Y is S.

In still yet another subset of RNAi agent delivery modules, n is 1, and R2 and R6 are
taken together to form a ring containing six atoms and R4 and R5 are taken together to form a
ring containing six atoms. Preferably, the ring system is a trans-decalin. For example, the
RNAi agent delivery module of this subset can include a compound of Formula (Y-1):

[Structure of Formula (Y-1) not recoverable from the source.]



The functional group can be, for example, a targeting group (e.g., a steroid or a
carbohydrate), a reporter group (e.g., a fluorophore), or a label (an isotopically labelled
moiety). The targeting group can further include protein binding agents, endothelial cell
targeting groups (e.g., RGD peptides and mimetics), cancer cell targeting groups (e.g., folate,
Vitamin B12, Biotin), bone cell targeting groups (e.g., bisphosphonates, polyglutamates,
polyaspartates), multivalent mannose (e.g., for macrophage testing), lactose, galactose,
N-acetyl-galactosamine, monoclonal antibodies, glycoproteins, lectins, melanotropin, or
thyrotropin.

As can be appreciated by the skilled artisan, methods of synthesizing the compounds
of the formulae herein will be evident to those of ordinary skill in the art. The synthesized
compounds can be separated from a reaction mixture and further purified by a method such
as column chromatography, high pressure liquid chromatography, or recrystallization.
Additionally, the various synthetic steps may be performed in an alternate sequence or order
to give the desired compounds. Synthetic chemistry transformations and protecting group
methodologies (protection and deprotection) useful in synthesizing the compounds described
herein are known in the art and include, for example, those such as described in R. Larock,
Comprehensive Organic Transformations, VCH Publishers (1989); T.W. Greene and P.G.M.
Wuts, Protective Groups in Organic Synthesis, 2d. Ed., John Wiley and Sons (1991); L.
Fieser and M. Fieser, Fieser and Fieser's Reagents for Organic Synthesis, John Wiley and
Sons (1994); and L. Paquette, ed., Encyclopedia of Reagents for Organic Synthesis, John
Wiley and Sons (1995), and subsequent editions thereof.

Ribose Replacement Monomer Subunits
iRNA agents can be modified in a number of ways which can optimize one or more
characteristics of the iRNA agent. In one aspect, the invention features a ribose replacement
monomer subunit (RRMS), or an iRNA agent which incorporates a RRMS, such as those
described herein and those described in one or more of United States Provisional Application
Serial No. 60/493,986 (Attorney Docket No. 14174-079P01), filed on August 8, 2003, which
is hereby incorporated by reference; United States Provisional Application Serial No.
60/494,597 (Attorney Docket No. 14174-080P01), filed on August 11, 2003, which is hereby
incorporated by reference; United States Provisional Application Serial No. 60/506,341
(Attorney Docket No. 14174-080P02), filed on September 26, 2003, which is hereby
incorporated by reference; and in United States Provisional Application Serial No.
60/518,453 (Attorney Docket No. 14174-080P03), filed on November 7, 2003, which is
hereby incorporated by reference.

In addition, the invention includes iRNA agents having a RRMS and another element
described herein. E.g., the invention includes an iRNA agent described herein, e.g., a
palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having
an architecture or structure described herein, an iRNA associated with an amphipathic
delivery agent described herein, an iRNA associated with a drug delivery module described
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as
described herein, which also incorporates a RRMS.

The ribose sugar of one or more ribonucleotide subunits of an iRNA agent can be
replaced with another moiety, e.g., a non-carbohydrate (preferably cyclic) carrier. A
ribonucleotide subunit in which the ribose sugar of the subunit has been so replaced is
referred to herein as a ribose replacement modification subunit (RRMS). A cyclic carrier
may be a carbocyclic ring system, i.e., all ring atoms are carbon atoms, or a heterocyclic ring
system, i.e., one or more ring atoms may be a heteroatom, e.g., nitrogen, oxygen, sulfur. The
cyclic carrier may be a monocyclic ring system, or may contain two or more rings, e.g. fused
rings. The cyclic carrier may be a fully saturated ring system, or it may contain one or more
double bonds.

The carriers further include (i) at least two "backbone attachment points" and (ii) at
least one "tethering attachment point." A "backbone attachment point" as used herein refers
to a functional group, e.g. a hydroxyl group, or generally, a bond available for, and that is
suitable for incorporation of the carrier into the backbone, e.g., the phosphate, or modified
phosphate, e.g., sulfur containing, backbone, of a ribonucleic acid. A "tethering attachment
point" as used herein refers to a constituent ring atom of the cyclic carrier, e.g., a carbon
atom or a heteroatom (distinct from an atom which provides a backbone attachment point),
that connects a selected moiety. The moiety can be, e.g., a ligand, e.g., a targeting or
delivery moiety, or a moiety which alters a physical property, e.g., lipophilicity, of an iRNA
agent. Optionally, the selected moiety is connected by an intervening tether to the cyclic
carrier. Thus, it will include a functional group, e.g., an amino group, or generally, provide a
bond, that is suitable for incorporation or tethering of another chemical entity, e.g., a ligand
to the constituent ring.

Incorporation of one or more RRMSs described herein into an RNA agent, e.g., an
iRNA agent, particularly when tethered to an appropriate entity, can confer one or more new
properties to the RNA agent and/or alter, enhance or modulate one or more existing
properties in the RNA molecule. E.g., it can alter one or more of lipophilicity or nuclease
resistance. Incorporation of one or more RRMSs described herein into an iRNA agent can,
particularly when the RRMS is tethered to an appropriate entity, modulate, e.g., increase,
binding affinity of an iRNA agent to a target mRNA, change the geometry of the duplex
form of the iRNA agent, alter distribution or target the iRNA agent to a particular part of the
body, or modify the interaction with nucleic acid binding proteins (e.g., during RISC
formation and strand separation).

Accordingly, in one aspect, the invention features an iRNA agent preferably
comprising a first strand and a second strand, wherein at least one subunit having a formula
(R-1) is incorporated into at least one of said strands.

[Structure of formula (R-1) not recoverable from the source.]

(R-1)

Referring to formula (R-1), X is N(CO)R7, NR7 or CH2; Y is NR8, O, S, CR9R10, or
absent; and Z is CR11R12 or absent.

Each of R1, R2, R3, R4, R9, and R10 is, independently, H, ORa, ORb, (CH2)nORa, or
(CH2)nORb, provided that at least one of R1, R2, R3, R4, R9, and R10 is ORa or ORb and that at
least one of R1, R2, R3, R4, R9, and R10 is (CH2)nORa or (CH2)nORb (when the RRMS is
terminal, one of R1, R2, R3, R4, R9, and R10 will include Ra and one will include Rb; when the
RRMS is internal, two of R1, R2, R3, R4, R9, and R10 will each include an Rb); further
provided that preferably ORa may only be present with (CH2)nORb and (CH2)nORa may only
be present with ORb.

Each of R5, R6, R11, and R12 is, independently, H, C1-C6 alkyl optionally substituted
with 1-3 R13, or C(O)NHR7; or R5 and R11 together are C3-C8 cycloalkyl optionally
substituted with R14.

R7 is C1-C20 alkyl substituted with NRcRd; R8 is C1-C6 alkyl; R13 is hydroxy, C1-C4
alkoxy, or halo; and R14 is NRcR7.



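Ra is:

[Structure not recoverable from the source; surviving labels: A, B, C.]

; and
Rb is:

[Structure not recoverable from the source; surviving labels: A, O-strand, C.]

Each of A and C is, independently, O or S.
B is OH, O-, or

[Structure not recoverable from the source; surviving labels: O, OH, O-.]

Rc is H or C1-C6 alkyl; Rd is H or a ligand; and n is 1-4.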

In a preferred embodiment the ribose is replaced with a pyrroline scaffold, and X is
N(CO)R7 or NR7, Y is CR9R10, and Z is absent.

In other preferred embodiments the ribose is replaced with a piperidine scaffold, and
X is N(CO)R7 or NR7, Y is CR9R10, and Z is CR11R12.

In other preferred embodiments the ribose is replaced with a piperazine scaffold, and
X is N(CO)R7 or NR7, Y is NR8, and Z is CR11R12.

In other preferred embodiments the ribose is replaced with a morpholino scaffold, and
X is N(CO)R7 or NR7, Y is O, and Z is CR11R12.

In other preferred embodiments the ribose is replaced with a decalin scaffold, and X
is CH2; Y is CR9R10; and Z is CR11R12; and R5 and R11 together are cycloalkyl.

In other preferred embodiments the ribose is replaced with a decalin/indane scaffold,
and X is CH2; Y is CR9R10; and Z is CR11R12; and R5 and R11 together are
cycloalkyl.

In other preferred embodiments, the ribose is replaced with a hydroxyproline
scaffold.
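
The scaffold definitions above can be collected into a small lookup table. The following Python sketch is illustrative only: the dictionary layout and the function name describe_scaffold are hypothetical, and the substituent numbering (R7, R8, R9 to R12) is a reading of the formula (R-1) definitions rather than anything prescribed by the specification.

```python
# Illustrative lookup of the X/Y/Z ring positions of formula (R-1) for each
# preferred scaffold. The data structure and names are hypothetical; the
# values are transcribed from the preferred embodiments in the text.
SCAFFOLDS = {
    "pyrroline": {"X": ("N(CO)R7", "NR7"), "Y": "CR9R10", "Z": None},
    "piperidine": {"X": ("N(CO)R7", "NR7"), "Y": "CR9R10", "Z": "CR11R12"},
    "piperazine": {"X": ("N(CO)R7", "NR7"), "Y": "NR8", "Z": "CR11R12"},
    "morpholino": {"X": ("N(CO)R7", "NR7"), "Y": "O", "Z": "CR11R12"},
    "decalin": {"X": ("CH2",), "Y": "CR9R10", "Z": "CR11R12"},
}


def describe_scaffold(name):
    """Return a one-line description of X, Y and Z for a named scaffold."""
    entry = SCAFFOLDS[name]
    x_text = " or ".join(entry["X"])
    z_text = entry["Z"] if entry["Z"] is not None else "absent"
    return f"X is {x_text}; Y is {entry['Y']}; Z is {z_text}"


def scaffolds_with_nitrogen_x():
    """List scaffolds whose X position can be a ring nitrogen."""
    return [name for name, entry in SCAFFOLDS.items() if "NR7" in entry["X"]]
```

For instance, describe_scaffold("pyrroline") returns the text "X is N(CO)R7 or NR7; Y is CR9R10; Z is absent", which mirrors the first preferred embodiment.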

RRMSs described herein may be incorporated into any double-stranded RNA-like
molecule described herein, e.g., an iRNA agent. An iRNA agent may include a duplex
comprising a hybridized sense and antisense strand, in which the antisense strand and/or the
sense strand may include one or more of the RRMSs described herein. An RRMS can be
introduced at one or more points in one or both strands of a double-stranded iRNA agent. An
RRMS can be placed at or near (within 1, 2, or 3 positions of) the 3' or 5' end of the sense
strand or at or near (within 2 or 3 positions of) the 3' end of the antisense strand. In some
embodiments it is preferred to not have an RRMS at or near (within 1, 2, or 3 positions of)
the 5' end of the antisense strand. An RRMS can be internal, and will preferably be
positioned in regions not critical for antisense binding to the target.

In an embodiment, an iRNA agent may have an RRMS at (or within 1, 2, or 3
positions of) the 3' end of the antisense strand. In an embodiment, an iRNA agent may have
an RRMS at (or within 1, 2, or 3 positions of) the 3' end of the antisense strand and at (or
within 1, 2, or 3 positions of) the 3' end of the sense strand. In an embodiment, an iRNA
agent may have an RRMS at (or within 1, 2, or 3 positions of) the 3' end of the antisense
strand and an RRMS at the 5' end of the sense strand, in which both ligands are located at the
same end of the iRNA agent.

In certain embodiments, two ligands are tethered, preferably one on each strand, and
are hydrophobic moieties. While not wishing to be bound by theory, it is believed that
pairing of the hydrophobic ligands can stabilize the iRNA agent via intermolecular van der
Waals interactions.

In an embodiment, an iRNA agent may have an RRMS at (or within 1, 2, or 3
positions of) the 3' end of the antisense strand and an RRMS at the 5' end of the sense strand,
in which both RRMSs may share the same ligand (e.g., cholic acid) via connection of their
individual tethers to separate positions on the ligand. A ligand shared between two proximal
RRMSs is referred to herein as a "hairpin ligand."

In other embodiments, an iRNA agent may have an RRMS at the 3' end of the sense
strand and an RRMS at an internal position of the sense strand. An iRNA agent may have an
RRMS at an internal position of the sense strand; or may have an RRMS at an internal
position of the antisense strand; or may have an RRMS at an internal position of the sense
strand and an RRMS at an internal position of the antisense strand.
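
The positional language used above ("at, or within 1, 2, or 3 positions of" a strand end) can be made concrete with a small helper. The following Python sketch rests on stated assumptions: positions are numbered from 1 at the 5' end, the 21-nucleotide strand length in the usage note is only an example, and the function names are hypothetical.

```python
def near_end(position, strand_length, end, window=3):
    """Return True if a 1-based position (counted from the 5' end) lies at,
    or within `window` positions of, the named strand end ("5'" or "3'")."""
    if not 1 <= position <= strand_length:
        raise ValueError("position must lie within the strand")
    if end == "5'":
        distance = position - 1
    elif end == "3'":
        distance = strand_length - position
    else:
        raise ValueError("end must be \"5'\" or \"3'\"")
    return distance <= window


def disfavoured_on_antisense(position, strand_length):
    """Flag positions at or near the 5' end of the antisense strand, where
    the text states that an RRMS is preferably not placed."""
    return near_end(position, strand_length, "5'")
```

With a hypothetical 21-nucleotide antisense strand, positions 18 to 21 count as at or near the 3' end, while positions 1 to 4 are flagged as disfavoured.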

In preferred embodiments the iRNA agent includes first and second sequences,
which are preferably two separate molecules as opposed to two sequences located on the
same strand, and which have sufficient complementarity to each other to hybridize (and
thereby form a duplex region), e.g., under physiological conditions, e.g., under physiological
conditions but not in contact with a helicase or other unwinding enzyme.

It is preferred that the first and second sequences be chosen such that the ds iRNA
agent includes a single strand or unpaired region at one or both ends of the molecule. Thus, a
ds iRNA agent contains first and second sequences, preferably paired to contain an overhang,
e.g., one or two 5' or 3' overhangs but preferably a 3' overhang of 2-3 nucleotides. Most
embodiments will have a 3' overhang. Preferred sRNA agents will have single-stranded
overhangs, preferably 3' overhangs, of 1 or preferably 2 or 3 nucleotides in length at each
end. The overhangs can be the result of one strand being longer than the other, or the result
of two strands of the same length being staggered. 5' ends are preferably phosphorylated.
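
The two ways of producing overhangs described above (strands of unequal length, or equal-length strands that are staggered) reduce to simple arithmetic. The following Python sketch assumes a single contiguous duplex region and no 5' overhangs, so that every unpaired nucleotide sits at a 3' end; the function name and the strand lengths in the usage note are hypothetical examples, not values taken from the specification.

```python
def three_prime_overhangs(sense_len, antisense_len, duplex_len):
    """Return (sense 3' overhang, antisense 3' overhang) in nucleotides.

    Assumes one contiguous duplex of `duplex_len` base pairs and no 5'
    overhangs, so each strand's unpaired nucleotides are all at its 3' end.
    """
    if duplex_len < 1:
        raise ValueError("duplex must contain at least one base pair")
    if duplex_len > min(sense_len, antisense_len):
        raise ValueError("duplex cannot be longer than either strand")
    return sense_len - duplex_len, antisense_len - duplex_len
```

Two staggered 21-nucleotide strands forming a 19 base pair duplex give a 2-nucleotide 3' overhang at each end; a 21-nucleotide sense strand fully paired to a 23-nucleotide antisense strand gives a single 2-nucleotide 3' overhang on the antisense strand.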

An RNA agent, e.g., an iRNA agent, containing a preferred, but nonlimiting RRMS is
presented as formula (R-2) in FIG. 4. The carrier includes two "backbone attachment points"
(hydroxyl groups), a "tethering attachment point," and a ligand, which is connected indirectly
to the carrier via an intervening tether. The RRMS may be the 5' or 3' terminal subunit of
the RNA molecule, i.e., one of the two "W" groups may be a hydroxyl group, and the other
"W" group may be a chain of two or more unmodified or modified ribonucleotides.
Alternatively, the RRMS may occupy an internal position, and both "W" groups may be one
or more unmodified or modified ribonucleotides. More than one RRMS may be present in an
RNA molecule, e.g., an iRNA agent.

The modified RNA molecule of formula (R-2) can be obtained using oligonucleotide
synthetic methods known in the art. In a preferred embodiment, the modified RNA molecule
of formula (R-2) can be prepared by incorporating one or more of the corresponding RRMS
monomer compounds (RRMS monomers, see, e.g., A, B, and C in FIG. 4) into a growing
sense or antisense strand, utilizing, e.g., phosphoramidite or H-phosphonate coupling
strategies.

The RRMS monomers generally include two differently functionalized hydroxyl
groups (OFG1 and OFG2 above), which are linked to the carrier molecule (see A in FIG. 4),
and a tethering attachment point. As used herein, the term "functionalized hydroxyl group"
means that the hydroxyl proton has been replaced by another substituent. As shown in
representative structures B and C, one hydroxyl group (OFG1) on the carrier is functionalized
with a protecting group (PG). The other hydroxyl group (OFG2) can be functionalized with
either (1) a liquid or solid phase synthesis support reagent (solid circle) directly or indirectly
through a linker, as in B, or (2) a phosphorus-containing moiety, e.g., a phosphoramidite as
in C. The tethering attachment point may be connected to a hydrogen atom, a tether, or a
tethered ligand at the time that the monomer is incorporated into the growing sense or
antisense strand (see R in Scheme 1). Thus, the tethered ligand can be, but need not be,
attached to the monomer at the time that the monomer is incorporated into the growing
strand. In certain embodiments, the tether, the ligand or the tethered ligand may be linked to
a "precursor" RRMS after a "precursor" RRMS monomer has been incorporated into the
strand.

The (OFG1) protecting group may be selected as desired, e.g., from T.W. Greene and
P.G.M. Wuts, Protective Groups in Organic Synthesis, 2d. Ed., John Wiley and Sons (1991).
The protecting group is preferably stable under amidite synthesis conditions, storage
conditions, and oligonucleotide synthesis conditions. Hydroxyl groups, -OH, are
nucleophilic groups (i.e., Lewis bases), which react through the oxygen with electrophiles
(i.e., Lewis acids). Hydroxyl groups in which the hydrogen has been replaced with a
protecting group, e.g., a triarylmethyl group or a trialkylsilyl group, are essentially unreactive
as nucleophiles in displacement reactions. Thus, the protected hydroxyl group is useful in
preventing, e.g., homocoupling of compounds exemplified by structure C during
oligonucleotide synthesis. A preferred protecting group is the dimethoxytrityl group.

When the OFG2 in B includes a linker, e.g., a long organic linker, connected to a
soluble or insoluble support reagent, solution or solid phase synthesis techniques can be
employed to build up a chain of natural and/or modified ribonucleotides once OFG1 is
deprotected and free to react as a nucleophile with another nucleoside or monomer
containing an electrophilic group (e.g., an amidite group). Alternatively, a natural or
modified ribonucleotide or oligoribonucleotide chain can be coupled to monomer C via an
amidite group or H-phosphonate group at OFG2. Subsequent to this operation, OFG1 can be
deblocked, and the restored nucleophilic hydroxyl group can react with another nucleoside or
monomer containing an electrophilic group (see FIG. 1). R' can be substituted or
unsubstituted alkyl or alkenyl. In preferred embodiments, R' is methyl, allyl or
2-cyanoethyl. R'' may be a C1-C10 alkyl group; preferably it is a branched group containing
three or more carbons, e.g., isopropyl.

OFG2 in B can be hydroxyl functionalized with a linker, which in turn contains a
liquid or solid phase synthesis support reagent at the other linker terminus. The support
reagent can be any support medium that can support the monomers described herein. The
monomer can be attached to an insoluble support via a linker, L, which allows the monomer
(and the growing chain) to be solubilized in the solvent in which the support is placed. The
solubilized, yet immobilized, monomer can react with reagents in the surrounding solvent;
unreacted reagents and soluble by-products can be readily washed away from the solid
support to which the monomer or monomer-derived products are attached. Alternatively, the
monomer can be attached to a soluble support moiety, e.g., polyethylene glycol (PEG), and
liquid phase synthesis techniques can be used to build up the chain. Linker and support
medium selection is within skill of the art. Generally the linker may be -C(O)(CH2)qC(O)-,
or -C(O)(CH2)qS-; preferably, it is oxalyl, succinyl or thioglycolyl. Standard controlled pore
glass solid phase synthesis supports cannot be used in conjunction with fluoride labile 5'
silyl protecting groups because the glass is degraded by fluoride with a significant reduction
in the amount of full-length product. Fluoride-stable polystyrene based supports or PEG are
preferred.

Preferred carriers have the general formula (R-3) provided below. (In that structure
preferred backbone attachment points can be chosen from R1 or R2; R3 or R4; or R9 and R10 if
Y is CR9R10 (two positions are chosen to give two backbone attachment points, e.g., R1 and
R4, or R4 and R9). Preferred tethering attachment points include R7; R5 or R6 when X is CH2.
The carriers are described below as an entity, which can be incorporated into a strand. Thus,
it is understood that the structures also encompass the situations wherein one (in the case of a
terminal position) or two (in the case of an internal position) of the attachment points, e.g., R1
or R2; R3 or R4; or R9 or R10 (when Y is CR9R10), is connected to the phosphate, or modified
phosphate, e.g., sulfur containing, backbone. E.g., one of the above-named R groups can be
-CH2-, wherein one bond is connected to the carrier and one to a backbone atom, e.g., a
linking oxygen or a central phosphorus atom.)

[Structure of formula (R-3) not recoverable from the source.]

(R-3)


X is N(CO)R7, NR7 or CH2; Y is NR8, O, S, CR9R10; and Z is CR11R12 or absent.

Each of R1, R2, R3, R4, R9, and R10 is, independently, H, ORa, or (CH2)nORb, provided
that at least two of R1, R2, R3, R4, R9, and R10 are ORa and/or (CH2)nORb.

Each of R5, R6, R11, and R12 is, independently, a ligand, H, C1-C6 alkyl optionally
substituted with 1-3 R13, or C(O)NHR7; or R5 and R11 together are C3-C8 cycloalkyl
optionally substituted with R14.

R7 is H, a ligand, or C1-C20 alkyl substituted with NRcRd; R8 is H or C1-C6 alkyl; R13
is hydroxy, C1-C4 alkoxy, or halo; R14 is NRcR7; R15 is C1-C6 alkyl optionally substituted
with cyano, or C2-C6 alkenyl; R16 is C1-C10 alkyl; and R17 is a liquid or solid phase support
reagent.

L is -C(O)(CH2)qC(O)-, or -C(O)(CH2)qS-; Ra is CAr3; Rb is P(O)(O-)H,
P(OR15)N(R16)2 or L-R17; Rc is H or C1-C6 alkyl; and Rd is H or a ligand.

Each Ar is, independently, C6-C10 aryl optionally substituted with C1-C4 alkoxy; n is
1-4; and q is 0-4.

Exemplary carriers include those in which, e.g., X is N(CO)R7 or NR7, Y is CR9R10,
and Z is absent; or X is N(CO)R7 or NR7, Y is CR9R10, and Z is CR11R12; or X is N(CO)R7 or
NR7, Y is NR8, and Z is CR11R12; or X is N(CO)R7 or NR7, Y is O, and Z is CR11R12; or X is
CH2; Y is CR9R10; Z is CR11R12, and R5 and R11 together form C6 cycloalkyl (H, z = 2), or
the indane ring system, e.g., X is CH2; Y is CR9R10; Z is CR11R12, and R5 and R11 together
form C5 cycloalkyl (H, z = 1).

In certain embodiments, the carrier may be based on the pyrroline ring system or the
3-hydroxyproline ring system, e.g., X is N(CO)R7 or NR7, Y is CR9R10, and Z is absent (D).
OFG1 is preferably attached to a primary carbon, e.g., an exocyclic alkylene
group, e.g., a methylene group, connected to one of the carbons in the five-membered ring
(-CH2OFG1 in D). OFG2 is preferably attached directly to one of the carbons in the
five-membered ring (-OFG2 in D). For the pyrroline-based carriers, -CH2OFG1 may be attached
to C-2 and OFG2 may be attached to C-3; or -CH2OFG1 may be attached to C-3 and OFG2
may be attached to C-4. In certain embodiments, CH2OFG1 and OFG2 may be geminally
substituted to one of the above-referenced carbons. For the 3-hydroxyproline-based carriers,
-CH2OFG1 may be attached to C-2 and OFG2 may be attached to C-4. The pyrroline- and
3-hydroxyproline-based monomers may therefore contain linkages (e.g., carbon-carbon bonds)
wherein bond rotation is restricted about that particular linkage, e.g. restriction resulting from
the presence of a ring. Thus, CH2OFG1 and OFG2 may be cis or trans with respect to one
another in any of the pairings delineated above. Accordingly, all cis/trans isomers are
expressly included. The monomers may also contain one or more asymmetric centers and
thus occur as racemates and racemic mixtures, single enantiomers, individual diastereomers
and diastereomeric mixtures. All such isomeric forms of the monomers are expressly
included. The tethering attachment point is preferably nitrogen.

In certain embodiments, the carrier may be based on the piperidine ring system (E),
e.g., X is N(CO)R7 or NR7, Y is CR9R10, and Z is CR11R12. OFG1 is preferably
attached to a primary carbon, e.g., an exocyclic alkylene group, e.g., a methylene group (n=1)
or ethylene group (n=2), connected to one of the carbons in the six-membered ring
[-(CH2)nOFG1 in E]. OFG2 is preferably attached directly to one of the carbons in the
six-membered ring (-OFG2 in E). -(CH2)nOFG1 and OFG2 may be disposed in a geminal manner
on the ring, i.e., both groups may be attached to the same carbon, e.g., at C-2, C-3, or C-4.
Alternatively, -(CH2)nOFG1 and OFG2 may be disposed in a vicinal manner on the ring, i.e.,
both groups may be attached to adjacent ring carbon atoms, e.g., -(CH2)nOFG1 may be
attached to C-2 and OFG2 may be attached to C-3; -(CH2)nOFG1 may be attached to C-3 and
OFG2 may be attached to C-2; -(CH2)nOFG1 may be attached to C-3 and OFG2 may be
attached to C-4; or -(CH2)nOFG1 may be attached to C-4 and OFG2 may be attached to C-3.
The piperidine-based monomers may therefore contain linkages (e.g., carbon-carbon bonds)
wherein bond rotation is restricted about that particular linkage, e.g. restriction resulting from
the presence of a ring. Thus, -(CH2)nOFG1 and OFG2 may be cis or trans with respect to one
another in any of the pairings delineated above. Accordingly, all cis/trans isomers are
expressly included. The monomers may also contain one or more asymmetric centers and
thus occur as racemates and racemic mixtures, single enantiomers, individual diastereomers
and diastereomeric mixtures. All such isomeric forms of the monomers are expressly
included. The tethering attachment point is preferably nitrogen.

[Structures D and E not recoverable from the source; surviving labels: OFG, LIGAND.]
In certain embodiments, the carrier may be based on the piperazine ring system (F),
e.g., X is N(CO)R7 or NR7, Y is NR8, and Z is CR11R12, or the morpholine ring system (G),
e.g., X is N(CO)R7 or NR7, Y is O, and Z is CR11R12. OFG1 is preferably
attached to a primary carbon, e.g., an exocyclic alkylene group, e.g., a methylene group,
connected to one of the carbons in the six-membered ring (-CH2OFG1 in F or G). OFG2 is
preferably attached directly to one of the carbons in the six-membered rings (-OFG2 in F or
G). For both F and G, -CH2OFG1 may be attached to C-2 and OFG2 may be attached to C-3;
or vice versa. In certain embodiments, CH2OFG1 and OFG2 may be geminally substituted to
one of the above-referenced carbons. The piperazine- and morpholine-based monomers may
therefore contain linkages (e.g., carbon-carbon bonds) wherein bond rotation is restricted
about that particular linkage, e.g. restriction resulting from the presence of a ring. Thus,
CH2OFG1 and OFG2 may be cis or trans with respect to one another in any of the pairings
delineated above. Accordingly, all cis/trans isomers are expressly included. The monomers
may also contain one or more asymmetric centers and thus occur as racemates and racemic
mixtures, single enantiomers, individual diastereomers and diastereomeric mixtures. All
such isomeric forms of the monomers are expressly included. R''' can be, e.g., C1-C6 alkyl,
preferably CH3. The tethering attachment point is preferably nitrogen in both F and G.

[Structures F and G not recoverable from the source; surviving labels: N, O, C2, C3, OFG,
CH2OFG, LIGAND.]
In certain embodiments, the carrier may be based on the decalin ring system, e.g., X
is CH2; Y is CR9R10; Z is CR11R12, and R5 and R11 together form C6 cycloalkyl (H, z = 2), or
the indane ring system, e.g., X is CH2; Y is CR9R10; Z is CR11R12, and R5 and R11 together
form C5 cycloalkyl (H, z = 1). OFG1 is preferably attached to a primary carbon,
e.g., an exocyclic methylene group (n=1) or ethylene group (n=2) connected to one of C-2,
C-3, C-4, or C-5 [-(CH2)nOFG1 in H]. OFG2 is preferably attached directly to one of C-2,
C-3, C-4, or C-5 (-OFG2 in H). -(CH2)nOFG1 and OFG2 may be disposed in a geminal manner
on the ring, i.e., both groups may be attached to the same carbon, e.g., at C-2, C-3, C-4, or
C-5. Alternatively, -(CH2)nOFG1 and OFG2 may be disposed in a vicinal manner on the ring,
i.e., both groups may be attached to adjacent ring carbon atoms, e.g., -(CH2)nOFG1 may be
attached to C-2 and OFG2 may be attached to C-3; -(CH2)nOFG1 may be attached to C-3 and
OFG2 may be attached to C-2; -(CH2)nOFG1 may be attached to C-3 and OFG2 may be
attached to C-4; -(CH2)nOFG1 may be attached to C-4 and OFG2 may be attached to C-3;
-(CH2)nOFG1 may be attached to C-4 and OFG2 may be attached to C-5; or -(CH2)nOFG1 may
be attached to C-5 and OFG2 may be attached to C-4. The decalin or indane-based
monomers may therefore contain linkages (e.g., carbon-carbon bonds) wherein bond rotation
is restricted about that particular linkage, e.g. restriction resulting from the presence of a ring.
Thus, -(CH2)nOFG1 and OFG2 may be cis or trans with respect to one another in any of the
pairings delineated above. Accordingly, all cis/trans isomers are expressly included. The
monomers may also contain one or more asymmetric centers and thus occur as racemates and
racemic mixtures, single enantiomers, individual diastereomers and diastereomeric mixtures.
All such isomeric forms of the monomers are expressly included. In a preferred
embodiment, the substituents at C-1 and C-6 are trans with respect to one another. The
tethering attachment point is preferably C-6 or C-7.

[Structure H not recoverable from the source; surviving label: OFG.]
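
The geminal and vicinal substitution patterns listed for the piperidine and decalin/indane carriers follow a simple rule, which can be enumerated programmatically. The following Python sketch is only an illustration: it assumes that consecutively numbered ring carbons are adjacent, the function names are hypothetical, and each tuple is read as (carbon bearing the exocyclic alkylene-linked functionalized hydroxyl, carbon bearing the directly attached functionalized hydroxyl).

```python
def geminal_positions(ring_carbons):
    """Both groups on the same ring carbon: one (c, c) pair per carbon."""
    return [(c, c) for c in ring_carbons]


def vicinal_pairings(ring_carbons):
    """Both groups on adjacent ring carbons, in either orientation.

    Assumes consecutively numbered carbons in `ring_carbons` are adjacent.
    """
    available = set(ring_carbons)
    return [
        (a, b)
        for a in ring_carbons
        for b in (a - 1, a + 1)
        if b in available
    ]
```

Calling vicinal_pairings([2, 3, 4]) reproduces the four piperidine pairings given in the text, and vicinal_pairings([2, 3, 4, 5]) reproduces the six decalin/indane pairings. Stereochemistry (cis or trans) is not modeled here.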
Other carriers may include those based on 3-hydroxyproline (J). Thus, -(CH2)nOFG1 and
OFG2 may be cis or trans with respect to one another. Accordingly, all cis/trans isomers are
expressly included. The monomers may also contain one or more asymmetric centers
and thus occur as racemates and racemic mixtures, single enantiomers, individual
diastereomers and diastereomeric mixtures. All such isomeric forms of the monomers are
expressly included. The tethering attachment point is preferably nitrogen.

[Structure J not recoverable from the source; surviving label: LIGAND.]

Representative carriers are shown in FIG. 5.

In certain embodiments, a moiety, e.g., a ligand, may be connected indirectly to the
carrier via the intermediacy of an intervening tether. Tethers are connected to the carrier at
the tethering attachment point (TAP) and may include any C1-C100 carbon-containing moiety
(e.g., C1-C75, C1-C50, C1-C20, C1-C10, C1-C6), preferably having at least one nitrogen atom. In
preferred embodiments, the nitrogen atom forms part of a terminal amino group on the tether,
which may serve as a connection point for the ligand. Preferred tethers (underlined) include
TAP-(CH2)nNH2; TAP-C(O)(CH2)nNH2; or TAP-NR''''(CH2)nNH2, in which n is 1-6 and
R'''' is C1-C6 alkyl, and R^d is hydrogen or a ligand. In other embodiments, the nitrogen may
form part of a terminal oxyamino group, e.g., -ONH2, or hydrazino group, -NHNH2. The
tether may optionally be substituted, e.g., with hydroxy, alkoxy, perhaloalkyl, and/or
optionally inserted with one or more additional heteroatoms, e.g., N, O, or S. Preferred

tethered ligands may include, e.g., TAP-(CH2)nNH(LIGAND),
TAP-C(O)(CH2)nNH(LIGAND), or TAP-NR''''(CH2)nNH(LIGAND);
TAP-(CH2)nONH(LIGAND), TAP-C(O)(CH2)nONH(LIGAND), or
TAP-NR''''(CH2)nONH(LIGAND); TAP-(CH2)nNHNH2(LIGAND),
TAP-C(O)(CH2)nNHNH2(LIGAND), or TAP-NR''''(CH2)nNHNH2(LIGAND).

In other embodiments the tether may include an electrophilic moiety, preferably at the
terminal position of the tether. Preferred electrophilic moieties include, e.g., an aldehyde,
alkyl halide, mesylate, tosylate, nosylate, or brosylate, or an activated carboxylic acid ester,
e.g., an NHS ester, or a pentafluorophenyl ester. Preferred tethers (underlined) include
TAP-(CH2)nCHO; TAP-C(O)(CH2)nCHO; or TAP-NR''''(CH2)nCHO, in which n is 1-6 and R''''
is C1-C6 alkyl; or TAP-(CH2)nC(O)ONHS; TAP-C(O)(CH2)nC(O)ONHS; or
TAP-NR''''(CH2)nC(O)ONHS, in which n is 1-6 and R'''' is C1-C6 alkyl;
TAP-(CH2)nC(O)OC6F5; TAP-C(O)(CH2)nC(O)OC6F5; or TAP-NR''''(CH2)nC(O)OC6F5,
in which n is 1-6 and R'''' is C1-C6 alkyl; or -(CH2)nCH2LG; TAP-C(O)(CH2)nCH2LG; or
TAP-NR''''(CH2)nCH2LG, in which n is 1-6 and R'''' is C1-C6 alkyl (LG can be a leaving
group, e.g., halide, mesylate, tosylate, nosylate, brosylate). Tethering can be carried out by
coupling a nucleophilic group of a ligand, e.g., a thiol or amino group, with an electrophilic
group on the tether.
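
As an illustrative aid only, the tether formulas recited above can be enumerated programmatically. The following Python sketch builds plain-text formula strings from the three TAP prefixes and a selection of the terminal groups named in the text, with n ranging from 1 to 6. The names `PREFIXES`, `TERMINI`, `tether_formula`, and `enumerate_tethers` are hypothetical and form no part of the disclosure; the sketch manipulates strings only and makes no statement about synthetic accessibility.

```python
from itertools import product

# Prefixes recited in the text; R'''' stands for a C1-C6 alkyl group.
PREFIXES = ["TAP-", "TAP-C(O)", "TAP-NR''''"]

# A selection of terminal groups recited in the text (hypothetical labels).
TERMINI = {
    "amino": "NH2",
    "oxyamino": "ONH2",
    "hydrazino": "NHNH2",
    "aldehyde": "CHO",
    "NHS ester": "C(O)ONHS",
    "pentafluorophenyl ester": "C(O)OC6F5",
}


def tether_formula(prefix: str, n: int, terminus: str) -> str:
    """Return a plain-text formula such as 'TAP-(CH2)3NH2'."""
    if not 1 <= n <= 6:
        raise ValueError("the text gives n as 1-6")
    return f"{prefix}(CH2){n}{terminus}"


def enumerate_tethers():
    """Yield (terminus label, formula) for every terminus, prefix, and n."""
    for (label, terminus), prefix, n in product(
        TERMINI.items(), PREFIXES, range(1, 7)
    ):
        yield label, tether_formula(prefix, n, terminus)


if __name__ == "__main__":
    for label, formula in enumerate_tethers():
        print(f"{label}: {formula}")
```

The nucleophilic termini (amino, oxyamino, hydrazino) and electrophilic termini (aldehyde, activated esters) are kept in one table here purely for brevity; a ligand bearing the complementary reactive group would be paired with each.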

Tethered Entities

A wide variety of entities can be tethered to an iRNA agent, e.g., to the carrier of an
RRMS. Examples are described below in the context of an RRMS, but that is only preferred;
entities can be coupled at other points to an iRNA agent.

Preferred moieties are ligands, which are coupled, preferably covalently, either
directly or indirectly via an intervening tether, to the RRMS carrier. In preferred
embodiments, the ligand is attached to the carrier via an intervening tether. As discussed
above, the ligand or tethered ligand may be present on the RRMS monomer when the RRMS
monomer is incorporated into the growing strand. In some embodiments, the ligand may be
incorporated into a "precursor" RRMS after a "precursor" RRMS monomer has been
incorporated into the growing strand. For example, an RRMS monomer having, e.g., an
amino-terminated tether (i.e., having no associated ligand), e.g., TAP-(CH2)nNH2 may be
incorporated into a growing sense or antisense strand. In a subsequent operation, i.e., after
incorporation of the precursor monomer into the strand, a ligand having an electrophilic
group, e.g., a pentafluorophenyl ester or aldehyde group, can subsequently be attached to the
precursor RRMS by coupling the electrophilic group of the ligand with the terminal
nucleophilic group of the precursor RRMS tether.

In preferred embodiments, a ligand alters the distribution, targeting or lifetime of an
iRNA agent into which it is incorporated. In preferred embodiments a ligand provides an
enhanced affinity for a selected target, e.g., molecule, cell or cell type, compartment, e.g., a
cellular or organ compartment, tissue, organ or region of the body, as, e.g., compared to a
species absent such a ligand. Preferred ligands will not take part in duplex pairing in a
duplexed nucleic acid.

Preferred ligands can improve transport, hybridization, and specificity properties and
may also improve nuclease resistance of the resultant natural or modified
oligoribonucleotide, or a polymeric molecule comprising any combination of monomers
described herein and/or natural or modified ribonucleotides.

Ligands in general can include therapeutic modifiers, e.g., for enhancing uptake;
diagnostic compounds or reporter groups, e.g., for monitoring distribution; cross-linking
agents; and nuclease-resistance conferring moieties. General examples include lipids,
steroids, vitamins, sugars, proteins, peptides, polyamines, and peptide mimics.

Ligands can include a naturally occurring substance, such as a protein (e.g., human
serum albumin (HSA), low-density lipoprotein (LDL), or globulin); carbohydrate (e.g., a
dextran, pullulan, chitin, chitosan, inulin, cyclodextrin or hyaluronic acid); or a lipid. The
ligand may also be a recombinant or synthetic molecule, such as a synthetic polymer, e.g., a
synthetic polyamino acid. Examples of polyamino acids include
polylysine (PLL), poly L-aspartic acid, poly L-glutamic acid, styrene-maleic acid anhydride
copolymer, poly(L-lactide-co-glycolide) copolymer, divinyl ether-maleic anhydride
copolymer, N-(2-hydroxypropyl)methacrylamide copolymer (HPMA), polyethylene glycol
(PEG), polyvinyl alcohol (PVA), polyurethane, poly(2-ethylacrylic acid),
N-isopropylacrylamide polymers, or polyphosphazine. Examples of polyamines include:
polyethylenimine, polylysine (PLL), spermine, spermidine, polyamine,
pseudopeptide-polyamine, peptidomimetic polyamine, dendrimer polyamine, arginine, amidine, protamine,
cationic lipid, cationic porphyrin, quaternary salt of a polyamine, or an alpha helical peptide.

Ligands can also include targeting groups, e.g., a cell or tissue targeting agent, e.g., a
lectin, glycoprotein, lipid or protein, e.g., an antibody, that binds to a specified cell type such
as a cancer cell, endothelial cell, or bone cell. A targeting group can be a thyrotropin,
melanotropin, lectin, glycoprotein, surfactant protein A, mucin carbohydrate, multivalent
lactose, multivalent galactose, N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent
mannose, multivalent fucose, glycosylated polyaminoacids,
transferrin, bisphosphonate, polyglutamate, polyaspartate, a lipid, cholesterol, a steroid, bile
acid, folate, vitamin B12, biotin, or an RGD peptide or RGD peptide mimetic.

Other examples of ligands include dyes, intercalating agents (e.g., acridines),
cross-linkers (e.g., psoralen, mitomycin C), porphyrins (TPPC4, texaphyrin, sapphyrin),
polycyclic aromatic hydrocarbons (e.g., phenazine, dihydrophenazine), artificial
endonucleases (e.g., EDTA), lipophilic molecules (e.g., cholesterol, cholic acid, adamantane
acetic acid, 1-pyrene butyric acid, dihydrotestosterone, 1,3-Bis-O(hexadecyl)glycerol,
geranyloxyhexyl group, hexadecylglycerol, borneol, menthol, 1,3-propanediol, heptadecyl
group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid, O3-(oleoyl)cholenic acid,
dimethoxytrityl, or phenoxazine), peptide conjugates (e.g., antennapedia peptide, Tat
peptide), alkylating agents, phosphate, amino, mercapto, PEG (e.g., PEG-40K), MPEG,
[MPEG]2, polyamino, alkyl, substituted alkyl, radiolabeled markers, enzymes, haptens (e.g.,
biotin), transport/absorption facilitators (e.g., aspirin, vitamin E, folic acid), synthetic
ribonucleases (e.g., imidazole, bisimidazole, histamine, imidazole clusters,
acridine-imidazole conjugates, Eu3+ complexes of tetraazamacrocycles), dinitrophenyl, HRP, or AP.

Ligands can be proteins, e.g., glycoproteins, or peptides, e.g., molecules having a
specific affinity for a co-ligand, or antibodies, e.g., an antibody that binds to a specified cell
type such as a cancer cell, endothelial cell, or bone cell. Ligands may also include hormones
and hormone receptors. They can also include non-peptidic species, such as lipids, lectins,
carbohydrates, vitamins, cofactors, multivalent lactose, multivalent galactose,
N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent mannose, or multivalent fucose. The
ligand can be, for example, a lipopolysaccharide, an activator of p38 MAP kinase, or an
activator of NF-kB.

The ligand can be a substance, e.g., a drug, which can increase the uptake of the iRNA
agent into the cell, for example, by disrupting the cell's cytoskeleton, e.g., by disrupting the
cell's microtubules, microfilaments, and/or intermediate filaments. The drug can be, for
example, taxon, vincristine, vinblastine, cytochalasin, nocodazole, jasplakinolide, latrunculin
A, phalloidin, swinholide A, indanocine, or myoservin.

The ligand can increase the uptake of the iRNA agent into the cell by activating an
inflammatory response, for example. Exemplary ligands that would have such an effect
include tumor necrosis factor alpha (TNFalpha), interleukin-1 beta, or gamma interferon.

In one aspect, the ligand is a lipid or lipid-based molecule. Such a lipid or lipid-based
molecule preferably binds a serum protein, e.g., human serum albumin (HSA). An
HSA binding ligand allows for distribution of the conjugate to a target tissue, e.g., a
non-kidney target tissue of the body. Preferably, the target tissue is the liver, preferably
parenchymal cells of the liver. Other molecules that can bind HSA can also be used as
ligands. For example, naproxen or aspirin can be used. A lipid or lipid-based ligand can (a)
increase resistance to degradation of the conjugate, (b) increase targeting or transport into a
target cell or cell membrane, and/or (c) be used to adjust binding to a serum protein, e.g.,
HSA.

A lipid based ligand can be used to modulate, e.g., control the binding of the
conjugate to a target tissue. For example, a lipid or lipid-based ligand that binds to HSA
more strongly will be less likely to be targeted to the kidney and therefore less likely to be
cleared from the body. A lipid or lipid-based ligand that binds to HSA less strongly can be
used to target the conjugate to the kidney.

In a preferred embodiment, the lipid based ligand binds HSA. Preferably, it binds
HSA with a sufficient affinity such that the conjugate will be preferably distributed to a
non-kidney tissue. However, it is preferred that the affinity not be so strong that the HSA-ligand
binding cannot be reversed.

In another preferred embodiment, the lipid based ligand binds HSA weakly or not at
all, such that the conjugate will be preferably distributed to the kidney. Other moieties that
target to kidney cells can also be used in place of or in addition to the lipid based ligand.

In another aspect, the ligand is a moiety, e.g., a vitamin, which is taken up by a target
cell, e.g., a proliferating cell. These are particularly useful for treating disorders
characterized by unwanted cell proliferation, e.g., of the malignant or non-malignant type,
e.g., cancer cells. Exemplary vitamins include vitamin A, E, and K. Other exemplary
vitamins include B vitamins, e.g., folic acid, B12, riboflavin, biotin, pyridoxal or other
vitamins or nutrients taken up by cancer cells. Also included are HSA and low density
lipoprotein (LDL).

In another aspect, the ligand is a cell-permeation agent, preferably a helical
cell-permeation agent. Preferably, the agent is amphipathic. An exemplary agent is a peptide
such as tat or antennapedia. If the agent is a peptide, it can be modified, including a
peptidylmimetic, invertomers, non-peptide or pseudo-peptide linkages, and use of D-amino
acids. The helical agent is preferably an alpha-helical agent, which preferably has a
lipophilic and a lipophobic phase.

The ligand can be a peptide or peptidomimetic. A peptidomimetic (also referred to
herein as an oligopeptidomimetic) is a molecule capable of folding into a defined
three-dimensional structure similar to a natural peptide. The attachment of peptide and
peptidomimetics to iRNA agents can affect pharmacokinetic distribution of the iRNA, such
as by enhancing cellular recognition and absorption. The peptide or peptidomimetic moiety
can be about 5-50 amino acids long, e.g., about 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 amino
acids long (see Table 1, for example).

Table 1. Exemplary Cell Permeation Peptides

Cell Permeation Peptide | Amino Acid Sequence | Reference
----------------------- | ------------------- | ---------
Penetratin | RQIKIWFQNRRMKWKK (SEQ ID NO:6737) | Derossi et al., J. Biol. Chem. 269:10444, 1994
Tat fragment (48-60) | GRKKRRQRRRPPQC (SEQ ID NO:6738) | Vives et al., J. Biol. Chem., 272:16010, 1997
Signal sequence-based peptide | GALFLGWLGAAGSTMGAWSQPKKKRKV (SEQ ID NO:6738) | Chaloin et al., Biochem. Biophys. Res. Commun., 243:601, 1998
PVEC | LLIILRKRIRKQAHAHSK (SEQ ID NO:6739) | Elmquist et al., Exp. Cell Res., 269:237, 2001
Transportan | GWTLNSAGYLLKINLKALAALAKKIL (SEQ ID NO:6740) | Pooga et al., FASEB J., 12:67, 1998
Amphiphilic model peptide | KLALKLALKALKAALKLA (SEQ ID NO:6741) | Oehlke et al., Mol. Ther., 2:339, 2000
Arg9 | RRRRRRRRR (SEQ ID NO:6742) | Mitchell et al., J. Pept. Res., 56:318, 2000
Bacterial cell wall permeating | KFFKFFKFFK (SEQ ID NO:6743) |
LL-37 | LLGDFFRKSKEKIGKEFKRIVQRHCDFLRNLVPRTES (SEQ ID NO:6744) |
Cecropin P1 | SWLSKTAKKLENSAKKRISEGIAIAIQGGPR (SEQ ID NO:6745) |
α-defensin | ACYCRIPACIAGERRYGTCIYQGRLWAFCC (SEQ ID NO:6746) |
β-defensin | DHYNCVSSGGQCLYSACPIFTKIQGTCYRGKAKCCK (SEQ ID NO:6747) |
Bactenecin | RKCRIVVIRVCR (SEQ ID NO:6748) |
PR-39 | RRRPRPPYLPRPRPPPFFPPRLPPRIPPGFPPRFPPRFPGKR-NH2 (SEQ ID NO:6749) |
Indolicidin | ILPWKWPWWPWRR-NH2 (SEQ ID NO:6750) |

A peptide or peptidomimetic can be, for example, a cell permeation peptide, cationic
peptide, amphipathic peptide, or hydrophobic peptide (e.g., consisting primarily of Tyr, Trp
or Phe). The peptide moiety can be a dendrimer peptide, constrained peptide or crosslinked
peptide. In another alternative, the peptide moiety can include a hydrophobic membrane
translocation sequence (MTS). An exemplary hydrophobic MTS-containing peptide is
RFGF having the amino acid sequence AAVALLPAVLLALLAP (SEQ ID NO:6751). An
RFGF analogue (e.g., amino acid sequence AALLPVLLAAP (SEQ ID NO:6752))
containing a hydrophobic MTS can also be a targeting moiety. The peptide moiety can be a
"delivery" peptide, which can carry large polar molecules including peptides,
oligonucleotides, and proteins across cell membranes. For example, sequences from the HIV
Tat protein (GRKKRRQRRRPPQ (SEQ ID NO:6753)) and the Drosophila Antennapedia
protein (RQIKIWFQNRRMKWKK (SEQ ID NO:6754)) have been found to be capable of
functioning as delivery peptides. A peptide or peptidomimetic can be encoded by a random
sequence of DNA, such as a peptide identified from a phage-display library, or
one-bead-one-compound (OBOC) combinatorial library (Lam et al., Nature, 354:82-84, 1991).
Preferably the peptide or peptidomimetic tethered to an iRNA agent via an incorporated
monomer unit is a cell targeting peptide such as an arginine-glycine-aspartic acid
(RGD)-peptide, or RGD mimic. A peptide moiety can range in length from about 5 amino acids to
about 40 amino acids. The peptide moieties can have a structural modification, such as to
increase stability or direct conformational properties. Any of the structural modifications
described below can be utilized.
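
By way of illustration only, the length range and the cationic or hydrophobic character described above can be checked for the sequences quoted in this passage. The following Python sketch is a minimal example under stated assumptions: the residue sets `CATIONIC` and `HYDROPHOBIC` are illustrative choices that do not appear in the text, and the helper names `in_length_range` and `fraction` are hypothetical.

```python
# Sequences quoted in the passage above (one-letter amino acid codes).
TAT = "GRKKRRQRRRPPQ"              # HIV Tat fragment
ANTENNAPEDIA = "RQIKIWFQNRRMKWKK"  # Drosophila Antennapedia fragment
RFGF = "AAVALLPAVLLALLAP"          # hydrophobic MTS-containing peptide

# Illustrative residue classes; these sets are assumptions, not definitions
# taken from the text.
CATIONIC = set("KR")
HYDROPHOBIC = set("AVLIPFWM")


def in_length_range(seq: str, low: int = 5, high: int = 40) -> bool:
    """True if the peptide length falls within about 5 to about 40 residues."""
    return low <= len(seq) <= high


def fraction(seq: str, residues: set) -> float:
    """Fraction of positions in seq whose residue belongs to the given set."""
    return sum(aa in residues for aa in seq) / len(seq)


if __name__ == "__main__":
    for name, seq in [("Tat", TAT), ("Antennapedia", ANTENNAPEDIA), ("RFGF", RFGF)]:
        print(
            name,
            len(seq),
            in_length_range(seq),
            round(fraction(seq, CATIONIC), 2),
            round(fraction(seq, HYDROPHOBIC), 2),
        )
```

Under these assumed residue classes, the Tat and Antennapedia fragments score as predominantly cationic, while every residue of the RFGF sequence falls in the hydrophobic set, consistent with its description as a hydrophobic MTS-containing peptide.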

An RGD peptide moiety can be used to target a tumor cell, such as an endothelial
tumor cell or a breast cancer tumor cell (Zitzmann et al., Cancer Res., 62:5139-43, 2002).
An RGD peptide can facilitate targeting of an iRNA agent to tumors of a variety of other
tissues, including the lung, kidney, spleen, or liver (Aoki et al., Cancer Gene Therapy
8:783-787, 2001). The RGD peptide can be linear or cyclic, and can be modified, e.g., glycosylated
or methylated to facilitate targeting to specific tissues. For example, a glycosylated RGD
peptide can deliver an iRNA agent to a tumor cell expressing αvβ3 (Haubner et al., Jour.
Nucl. Med., 42:326-336, 2001).

Peptides that target markers enriched in proliferating cells can be used. E.g., RGD
containing peptides and peptidomimetics can target cancer cells, in particular cells that
exhibit an αvβ3 integrin. Thus, one could use RGD peptides, cyclic peptides containing
RGD, RGD peptides that include D-amino acids, as well as synthetic RGD mimics. In
addition to RGD, one can use other moieties that target the αvβ3 integrin ligand. Generally,
such ligands can be used to control proliferating cells and angiogenesis. Preferred conjugates
of this type include an iRNA agent that targets PECAM-1, VEGF, or other cancer gene, e.g.,
a cancer gene described herein.

A "cell permeation peptide" is capable of permeating a cell, e.g., a microbial cell,
such as a bacterial or fungal cell, or a mammalian cell, such as a human cell. A microbial
cell-permeating peptide can be, for example, an α-helical linear peptide (e.g., LL-37 or
Cecropin P1), a disulfide bond-containing peptide (e.g., α-defensin, β-defensin or bactenecin),
or a peptide containing only one or two dominating amino acids (e.g., PR-39 or indolicidin).
A cell permeation peptide can also include a nuclear localization signal (NLS). For example,
a cell permeation peptide can be a bipartite amphipathic peptide, such as MPG, which is
derived from the fusion peptide domain of HIV-1 gp41 and the NLS of SV40 large T antigen
(Simeoni et al., Nucl. Acids Res. 31:2717-2724, 2003).

In one embodiment, a targeting peptide tethered to an RRMS can be an amphipathic
α-helical peptide. Exemplary amphipathic α-helical peptides include, but are not limited to,
cecropins, lycotoxins, paradaxins, buforin, CPF, bombinin-like peptide (BLP), cathelicidins,
ceratotoxins, S. clava peptides, hagfish intestinal antimicrobial peptides (HFIAPs),
magainines, brevinins-2, dermaseptins, melittins, pleurocidin, H2A peptides, Xenopus
peptides, esculentinis-1, and caerins. A number of factors will preferably be considered to
maintain the integrity of helix stability. For example, a maximum number of helix
stabilization residues will be utilized (e.g., leu, ala, or lys), and a minimum number of helix
destabilization residues will be utilized (e.g., proline, or cyclic monomeric units). The
capping residue will be considered (for example, Gly is an exemplary N-capping residue),
and/or C-terminal amidation can be used to provide an extra H-bond to stabilize the helix.
Formation of salt bridges between residues with opposite charges, separated by i ± 3 or i ± 4
positions, can provide stability. For example, cationic residues such as lysine, arginine,
homo-arginine, ornithine or histidine can form salt bridges with the anionic residues
glutamate or aspartate.
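
The salt bridge spacing described above can be illustrated with a short Python sketch that scans a peptide sequence for oppositely charged residues separated by three or four positions. This is offered only as a hedged example: the function name `salt_bridge_pairs` is hypothetical, only residues having standard one-letter codes are considered (homo-arginine and ornithine are therefore omitted), and the sketch identifies candidate pairs by sequence position alone without assessing whether the peptide actually adopts a helical conformation.

```python
# One-letter codes for the charged residues named in the text that have
# standard codes; homo-arginine and ornithine are omitted (an assumption
# made for simplicity).
CATIONIC = set("KRH")  # lysine, arginine, histidine
ANIONIC = set("DE")    # aspartate, glutamate


def salt_bridge_pairs(seq: str, spacings=(3, 4)):
    """Return 0-based (i, j) pairs, i < j, of oppositely charged residues
    whose positions differ by one of the given spacings."""
    pairs = []
    for i, a in enumerate(seq):
        for d in spacings:
            j = i + d
            if j >= len(seq):
                continue
            b = seq[j]
            opposite = (a in CATIONIC and b in ANIONIC) or (
                a in ANIONIC and b in CATIONIC
            )
            if opposite:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the text.
    print(salt_bridge_pairs("EAAAKEAAAK"))
```

For the hypothetical sequence shown, the glutamate at position 0 pairs with the lysine at position 4, and the glutamate at position 5 pairs with the lysine at position 9, each an i + 4 spacing.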

Peptide and peptidomimetic ligands include those having naturally occurring or
modified peptides, e.g., D or L peptides; α, β, or γ peptides; N-methyl peptides; azapeptides;
peptides having one or more amide, i.e., peptide, linkages replaced with one or more urea,
thiourea, carbamate, or sulfonyl urea linkages; or cyclic peptides.

Methods for making iRNA agents

iRNA agents can include modified or non-naturally occurring bases, e.g., bases
described in copending and co-owned United States Provisional Application Serial No.
60/463,772 (Attorney Docket No. 14174-070P01), filed on April 17, 2003, which is hereby
incorporated by reference and/or in copending and co-owned United States Provisional
Application Serial No. 60/465,802 (Attorney Docket No. 14174-074P01), filed on April 25,
2003, which is hereby incorporated by reference. Monomers and iRNA agents which include
such bases can be made by the methods found in United States Provisional Application Serial
No. 60/463,772 (Attorney Docket No. 14174-070P01), filed on April 17, 2003, and/or in
United States Provisional Application Serial No. 60/465,802 (Attorney Docket No.
14174-074P01), filed on April 25, 2003.

In addition, the invention includes iRNA agents having a modified or non-naturally
occurring base and another element described herein. E.g., the invention includes an iRNA
agent described herein, e.g., a palindromic iRNA agent, an iRNA agent having a non
canonical pairing, an iRNA agent which targets a gene described herein, e.g., a gene active in
the liver, an iRNA agent having an architecture or structure described herein, an iRNA
associated with an amphipathic delivery agent described herein, an iRNA associated with a
drug delivery module described herein, an iRNA agent administered as described herein, or
an iRNA agent formulated as described herein, which also incorporates a modified or
non-naturally occurring base.

The synthesis and purification of oligonucleotide peptide conjugates can be
performed by established methods. See, for example, Trufert et al., Tetrahedron, 52:3005,
1996; and Manoharan, "Oligonucleotide Conjugates in Antisense Technology," in Antisense
Drug Technology, ed. S.T. Crooke, Marcel Dekker, Inc., 2001.

In one embodiment of the invention, a peptidomimetic can be modified to create a
constrained peptide that adopts a distinct and specific preferred conformation, which can
increase the potency and selectivity of the peptide. For example, the constrained peptide can
be an azapeptide (Gante, Synthesis, 405-413, 1989). An azapeptide is synthesized by
replacing the α-carbon of an amino acid with a nitrogen atom without changing the structure
of the amino acid side chain. For example, the azapeptide can be synthesized by using
hydrazine in traditional peptide synthesis coupling methods, such as by reacting hydrazine
with a "carbonyl donor," e.g., phenylchloroformate.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be an N-methyl peptide. N-methyl peptides are
composed of N-methyl amino acids, which provide an additional methyl group in the
peptide backbone, thereby potentially providing additional means of resistance to proteolytic
cleavage. N-methyl peptides can be synthesized by methods known in the art (see, for
example, Lindgren et al., Trends Pharmacol. Sci. 21:99, 2000; Cell Penetrating Peptides:
Processes and Applications, Langel, ed., CRC Press, Boca Raton, FL, 2002; Fischer et al.,
Bioconjugate Chem. 12:825, 2001; Wender et al., J. Am. Chem. Soc., 124:13382, 2002).
For example, an Ant or Tat peptide can be an N-methyl peptide.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be a β-peptide. β-peptides form stable secondary
structures such as helices, pleated sheets, turns and hairpins in solutions. Their cyclic
derivatives can fold into nanotubes in the solid state. β-peptides are resistant to degradation
by proteolytic enzymes. β-peptides can be synthesized by methods known in the art. For
example, an Ant or Tat peptide can be a β-peptide.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be an oligocarbamate. Oligocarbamate peptides are
internalized into a cell by a transport pathway facilitated by carbamate transporters. For
example, an Ant or Tat peptide can be an oligocarbamate.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be an oligourea conjugate (or an oligothiourea
conjugate), in which the amide bond of a peptidomimetic is replaced with a urea moiety.
Replacement of the amide bond provides increased resistance to degradation by proteolytic
enzymes, e.g., proteolytic enzymes in the gastrointestinal tract. In one embodiment, an
oligourea conjugate is tethered to an iRNA agent for use in oral delivery. The backbone in
each repeating unit of an oligourea peptidomimetic can be extended by one carbon atom in
comparison with the natural amino acid. The single carbon atom extension can increase
peptide stability and lipophilicity, for example. An oligourea peptide can therefore be
advantageous when an iRNA agent is directed for passage through a bacterial cell wall, or
when an iRNA agent must traverse the blood-brain barrier, such as for the treatment of a
neurological disorder. In one embodiment, a hydrogen bonding unit is conjugated to the
oligourea peptide, such as to create an increased affinity with a receptor. For example, an
Ant or Tat peptide can be an oligourea conjugate (or an oligothiourea conjugate).

The siRNA peptide conjugates of the invention can be affiliated with, e.g., tethered 
to, RRMSs occurring at various positions on an iRNA agent. For example, a peptide can be 
terminally conjugated, on either the sense or the antisense strand, or a peptide can be 
bisconjugated (one peptide tethered to each end, one conjugated to the sense strand, and one 
conjugated to the antisense strand). In another option, the peptide can be internally 
conjugated, such as in the loop of a short hairpin iRNA agent. In yet another option, the 
peptide can be affiliated with a complex, such as a peptide-carrier complex. 

A peptide-carrier complex consists of at least a carrier molecule, which can 
encapsulate one or more iRNA agents (such as for delivery to a biological system and/or a 
cell), and a peptide moiety tethered to the outside of the carrier molecule, such as for 
targeting the carrier complex to a particular tissue or cell type. A carrier complex can carry 
additional targeting molecules on the exterior of the complex, or fusogenic agents to aid in 
cell delivery. The one or more iRNA agents encapsulated within the carrier can be 
conjugated to lipophilic molecules, which can aid in the delivery of the agents to the interior 
of the carrier. 

A carrier molecule or structure can be, for example, a micelle, a liposome (e.g., a
cationic liposome), a nanoparticle, a microsphere, or a biodegradable polymer. A peptide
moiety can be tethered to the carrier molecule by a variety of linkages, such as a disulfide
linkage, an acid labile linkage, a peptide-based linkage, an oxyamino linkage or a hydrazine
linkage. For example, a peptide-based linkage can be a GFLG peptide. Certain linkages will
have particular advantages, and the advantages (or disadvantages) can be considered
depending on the tissue target or intended use. For example, peptide based linkages are
stable in the blood stream but are susceptible to enzymatic cleavage in the lysosomes.

Targeting 

The iRNA agents of the invention are particularly useful when targeted to the liver.
An iRNA agent can be targeted to the liver by incorporation of an RRMS containing a ligand
that targets the liver. For example, a liver-targeting agent can be a lipophilic moiety.
Preferred lipophilic moieties include lipid, cholesterols, oleyl, retinyl, or cholesteryl residues.
Other lipophilic moieties that can function as liver-targeting agents include cholic acid,
adamantane acetic acid, 1-pyrene butyric acid, dihydrotestosterone,
1,3-Bis-O(hexadecyl)glycerol, geranyloxyhexyl group, hexadecylglycerol, borneol, menthol,
1,3-propanediol, heptadecyl group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid,
O3-(oleoyl)cholenic acid, dimethoxytrityl, or phenoxazine.

An iRNA agent can also be targeted to the liver by association with a low-density 
lipoprotein (LDL), such as lactosylated LDL. Polymeric carriers complexed with sugar 
residues can also function to target iRNA agents to the liver. 

A targeting agent that incorporates a sugar, e.g., galactose and/or analogues thereof, is
particularly useful. These agents target, in particular, the parenchymal cells of the liver. For
example, a targeting moiety can include more than one or preferably two or three galactose
moieties, spaced about 15 angstroms from each other. The targeting moiety can alternatively
be lactose (e.g., three lactose moieties), which is glucose coupled to a galactose. The
targeting moiety can also be N-acetyl-galactosamine or N-acetyl-glucosamine. A mannose or
mannose-6-phosphate targeting moiety can be used for macrophage targeting.

Conjugation of an iRNA agent with a serum albumin (SA), such as human serum 
albumin, can also be used to target the iRNA agent to the liver. 

An iRNA agent targeted to the liver by an RRMS targeting moiety described herein
can target a gene expressed in the liver. For example, the iRNA agent can target
p21(WAF1/DIP1), p27(KIP1), the α-fetoprotein gene, beta-catenin, or c-MET, such as for
treating a cancer of the liver. In another embodiment, the iRNA agent can target apoB-100,
such as for the treatment of an HDL/LDL cholesterol imbalance; dyslipidemias, e.g., familial
combined hyperlipidemia (FCHL), or acquired hyperlipidemia; hypercholesterolemia;
statin-resistant hypercholesterolemia; coronary artery disease (CAD); coronary heart disease
(CHD); or atherosclerosis. In another embodiment, the iRNA agent can target forkhead
homologue in rhabdomyosarcoma (FKHR); glucagon; glucagon receptor; glycogen
phosphorylase; PPAR-Gamma Coactivator (PGC-1); fructose-1,6-bisphosphatase;
glucose-6-phosphatase; glucose-6-phosphate translocator; glucokinase inhibitory regulatory protein;
or phosphoenolpyruvate carboxykinase (PEPCK), such as to inhibit hepatic glucose
production in a mammal, such as a human, such as for the treatment of diabetes. In another
embodiment, an iRNA agent targeted to the liver can target Factor V, e.g., the Leiden Factor
V allele, such as to reduce the tendency to form a blood clot. An iRNA agent targeted to the
liver can include a sequence which targets hepatitis virus (e.g., Hepatitis A, B, C, D, E, F, G,
or H). For example, an iRNA agent of the invention can target any one of the nonstructural
proteins of HCV: NS3, 4A, 4B, 5A, or 5B. For the treatment of hepatitis B, an iRNA agent
can target the protein X (HBx) gene, for example.

Preferred ligands on RRMSs include folic acid, glucose, cholesterol, cholic acid,
Vitamin E, Vitamin K, or Vitamin A.

Definitions

The term "halo" refers to any radical of fluorine, chlorine, bromine or iodine.
The term "alkyl" refers to a hydrocarbon chain that may be a straight chain or
branched chain, containing the indicated number of carbon atoms. For example, C1-C12 alkyl
indicates that the group may have from 1 to 12 (inclusive) carbon atoms in it. The term
"haloalkyl" refers to an alkyl in which one or more hydrogen atoms are replaced by halo, and
includes alkyl moieties in which all hydrogens have been replaced by halo (e.g.,
perfluoroalkyl). Alkyl and haloalkyl groups may be optionally inserted with O, N, or S. The
term "aralkyl" refers to an alkyl moiety in which an alkyl hydrogen atom is replaced by an
aryl group. Aralkyl includes groups in which more than one hydrogen atom has been
replaced by an aryl group. Examples of "aralkyl" include benzyl, 9-fluorenyl, benzhydryl,
and trityl groups.

The term "alkenyl" refers to a straight or branched hydrocarbon chain containing 2-8 
carbon atoms and characterized in having one or more double bonds. Examples of a typical 
alkenyl include, but are not limited to, allyl, propenyl, 2-butenyl, 3-hexenyl and 3-octenyl
groups. The term "alkynyl" refers to a straight or branched hydrocarbon chain containing 2-8
carbon atoms and characterized in having one or more triple bonds. Some examples of a
typical alkynyl are ethynyl, 2-propynyl, 3-methylbutynyl, and propargyl. The sp2 and
sp3 carbons may optionally serve as the point of attachment of the alkenyl and alkynyl
groups, respectively.

The term "alkoxy" refers to an -O-alkyl radical. The term "aminoalkyl" refers to an
alkyl substituted with an amino. The term "mercapto" refers to an -SH radical. The term
"thioalkoxy" refers to an -S-alkyl radical.

The term "alkylene" refers to a divalent alkyl (i.e., -R-), e.g., -CH2-, -CH2CH2-, and
-CH2CH2CH2-. The term "alkylenedioxo" refers to a divalent species of the structure
-O-R-O-, in which R represents an alkylene.

The term "aryl" refers to an aromatic monocyclic, bicyclic, or tricyclic hydrocarbon
ring system, wherein any ring atom capable of substitution can be substituted by a
substituent. Examples of aryl moieties include, but are not limited to, phenyl, naphthyl, and
anthracenyl.

The term "cycloalkyl" as employed herein includes saturated cyclic, bicyclic,
tricyclic, or polycyclic hydrocarbon groups having 3 to 12 carbons, wherein any ring atom
capable of substitution can be substituted by a substituent. The cycloalkyl groups herein
described may also contain fused rings. Fused rings are rings that share a common
carbon-carbon bond. Examples of cycloalkyl moieties include, but are not limited to, cyclohexyl,
adamantyl, and norbornyl.

The term "heterocyclyl" refers to a nonaromatic 3-10 membered monocyclic, 8-12
membered bicyclic, or 11-14 membered tricyclic ring system having 1-3 heteroatoms if
monocyclic, 1-6 heteroatoms if bicyclic, or 1-9 heteroatoms if tricyclic, said heteroatoms
selected from O, N, or S (e.g., carbon atoms and 1-3, 1-6, or 1-9 heteroatoms of N, O, or S if
monocyclic, bicyclic, or tricyclic, respectively), wherein any ring atom capable of
substitution can be substituted by a substituent. The heterocyclyl groups herein described
may also contain fused rings. Fused rings are rings that share a common carbon-carbon
bond. Examples of heterocyclyl include, but are not limited to, tetrahydrofuranyl,
tetrahydropyranyl, piperidinyl, morpholino, pyrrolinyl and pyrrolidinyl.

The term "heteroaryl" refers to an aromatic 5-8 membered monocyclic, 8-12
membered bicyclic, or 11-14 membered tricyclic ring system having 1-3 heteroatoms if
monocyclic, 1-6 heteroatoms if bicyclic, or 1-9 heteroatoms if tricyclic, said heteroatoms
selected from O, N, or S (e.g., carbon atoms and 1-3, 1-6, or 1-9 heteroatoms of N, O, or S if
monocyclic, bicyclic, or tricyclic, respectively), wherein any ring atom capable of
substitution can be substituted by a substituent.

The term "oxo" refers to an oxygen atom, which forms a carbonyl when attached to
carbon, an N-oxide when attached to nitrogen, and a sulfoxide or sulfone when attached to 
sulfur. 

The term "acyl" refers to an alkylcarbonyl, cycloalkylcarbonyl, arylcarbonyl,
heterocyclylcarbonyl, or heteroarylcarbonyl substituent, any of which may be further
substituted by substituents.

The term "substituents" refers to a group "substituted" on an alkyl, cycloalkyl, 
alkenyl, alkynyl, heterocyclyl, heterocycloalkenyl, cycloalkenyl, aryl, or heteroaryl group at 
any atom of that group. Suitable substituents include, without limitation, alkyl, alkenyl,
alkynyl, alkoxy, halo, hydroxy, cyano, nitro, amino, SO3H, sulfate, phosphate,
perfluoroalkyl, perfluoroalkoxy, methylenedioxy, ethylenedioxy, carboxyl, oxo, thioxo,
imino (alkyl, aryl, aralkyl), S(O)n alkyl (where n is 0-2), S(O)n aryl (where n is 0-2), S(O)n
heteroaryl (where n is 0-2), S(O)n heterocyclyl (where n is 0-2), amine (mono-, di-, alkyl,
cycloalkyl, aralkyl, heteroaralkyl, and combinations thereof), ester (alkyl, aralkyl,
heteroaralkyl), amide (mono-, di-, alkyl, aralkyl, heteroaralkyl, and combinations thereof),
sulfonamide (mono-, di-, alkyl, aralkyl, heteroaralkyl, and combinations thereof),
unsubstituted aryl, unsubstituted heteroaryl, unsubstituted heterocyclyl, and unsubstituted
cycloalkyl. In one aspect, the substituents on a group are independently any one single, or
any subset of the aforementioned substituents.

The terms "adeninyl, cytosinyl, guaninyl, thyminyl, and uracilyl" and the like refer to 
radicals of adenine, cytosine, guanine, thymine, and uracil.

As used herein, an "unusual" nucleobase can include any one of the following: 

2-methyladeninyl,
N6-methyladeninyl,
2-methylthio-N6-methyladeninyl,
N6-isopentenyladeninyl,
2-methylthio-N6-isopentenyladeninyl,
N6-(cis-hydroxyisopentenyl)adeninyl,
2-methylthio-N6-(cis-hydroxyisopentenyl)adeninyl,
N6-glycinylcarbamoyladeninyl,
N6-threonylcarbamoyladeninyl,
2-methylthio-N6-threonylcarbamoyladeninyl,
N6-methyl-N6-threonylcarbamoyladeninyl,
N6-hydroxynorvalylcarbamoyladeninyl,
2-methylthio-N6-hydroxynorvalylcarbamoyladeninyl,
N6,N6-dimethyladeninyl,
3-methylcytosinyl,
5-methylcytosinyl,
2-thiocytosinyl,
5-formylcytosinyl,
N4-methylcytosinyl,
5-hydroxymethylcytosinyl,
1-methylguaninyl,
N2-methylguaninyl,
7-methylguaninyl,
N2,N2-dimethylguaninyl,
N2,N2,7-trimethylguaninyl,
1-methylguaninyl,
7-cyano-7-deazaguaninyl,
7-aminomethyl-7-deazaguaninyl,
pseudouracilyl,
dihydrouracilyl,
5-methyluracilyl,
1-methylpseudouracilyl,
2-thiouracilyl,
4-thiouracilyl,
2-thiothyminyl,
5-methyl-2-thiouracilyl,
3-(3-amino-3-carboxypropyl)uracilyl,
5-hydroxyuracilyl,
5-methoxyuracilyl,
uracilyl 5-oxyacetic acid,
uracilyl 5-oxyacetic acid methyl ester,
5-(carboxyhydroxymethyl)uracilyl,
5-(carboxyhydroxymethyl)uracilyl methyl ester,
5-methoxycarbonylmethyluracilyl,
5-methoxycarbonylmethyl-2-thiouracilyl,
5-aminomethyl-2-thiouracilyl,
5-methylaminomethyluracilyl,
5-methylaminomethyl-2-thiouracilyl,
5-methylaminomethyl-2-selenouracilyl,
5-carbamoylmethyluracilyl,
5-carboxymethylaminomethyluracilyl,
5-carboxymethylaminomethyl-2-thiouracilyl,
3-methyluracilyl,
1-methyl-3-(3-amino-3-carboxypropyl)pseudouracilyl,
5-carboxymethyluracilyl,
5-methyldihydrouracilyl, or
3-methylpseudouracilyl.

Asymmetrical Modifications

In one aspect, the invention features an iRNA agent which can be asymmetrically
modified as described herein.

In addition, the invention includes iRNA agents having asymmetrical modifications 
and another element described herein. E.g., the invention includes an iRNA agent described 
herein, e.g., a palindromic iRNA agent, an iRNA agent having a non canonical pairing, an
iRNA agent which targets a gene described herein, e.g., a gene active in the liver, an iRNA
agent having an architecture or structure described herein, an iRNA associated with an
amphipathic delivery agent described herein, an iRNA associated with a drug delivery
module described herein, an iRNA agent administered as described herein, or an iRNA agent
formulated as described herein, which also incorporates an asymmetrical modification. 

iRNA agents of the invention can be asymmetrically modified. An asymmetrically 
modified iRNA agent is one in which a strand has a modification which is not present on the 
other strand. An asymmetrical modification is a modification found on one strand but not on
the other strand. Any modification, e.g., any modification described herein, can be present as
an asymmetrical modification. An asymmetrical modification can confer any of the desired
properties associated with a modification, e.g., those properties discussed herein. E.g., an
asymmetrical modification can: confer resistance to degradation, an alteration in half life;
target the iRNA agent to a particular target, e.g., to a particular tissue; modulate, e.g.,
increase or decrease, the affinity of a strand for its complement or target sequence; or hinder 
or promote modification of a terminal moiety, e.g., modification by a kinase or other 
enzymes involved in the RISC mechanism pathway. The designation of a modification as 
having one property does not mean that it has no other property, e.g., a modification referred 
to as one which promotes stabilization might also enhance targeting. 
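
The definition above can be read as a comparison of two sets. The following is a minimal Python sketch offered only as an illustration; it is not part of the disclosure. The function names and the modification labels are hypothetical, and the sketch compares modification types only, ignoring where on each strand a modification sits.

```python
def asymmetrical_modifications(sense_mods, antisense_mods):
    """Return the modifications found on exactly one strand, grouped by strand."""
    sense = set(sense_mods)
    antisense = set(antisense_mods)
    return {
        "sense_only": sense - antisense,
        "antisense_only": antisense - sense,
    }


def is_asymmetrically_modified(sense_mods, antisense_mods):
    """True if at least one strand carries a modification absent from the other."""
    result = asymmetrical_modifications(sense_mods, antisense_mods)
    return bool(result["sense_only"] or result["antisense_only"])


# Hypothetical labels: 2'-OMe and a cholesterol conjugate on the sense strand,
# a phosphorothioate linkage on the antisense strand.
example = asymmetrical_modifications(
    {"2'-OMe", "cholesterol"}, {"phosphorothioate"}
)
```

Under this simplified reading, an agent carrying the same modification types on both strands would not count as asymmetrically modified.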

While not wishing to be bound by theory or any particular mechanistic model, it is
believed that asymmetrical modification allows an iRNA agent to be optimized in view of the
different or "asymmetrical" functions of the sense and antisense strands. For example, both
strands can be modified to increase nuclease resistance; however, since some changes can
inhibit RISC activity, these changes can be chosen for the sense strand. In addition, since
some modifications, e.g., targeting moieties, can add large bulky groups that, e.g., can
interfere with the cleavage activity of the RISC complex, such modifications are preferably
placed on the sense strand. Thus, targeting moieties, especially bulky ones (e.g., cholesterol),
are preferentially added to the sense strand. In one embodiment, an asymmetrical
modification in which a phosphate of the backbone is substituted with S, e.g., a
phosphorothioate modification, is present in the antisense strand, and a 2' modification, e.g.,
2' OMe, is present in the sense strand. A targeting moiety can be present at either (or both)
the 5' or 3' end of the sense strand of the iRNA agent. In a preferred example, a P of the
backbone is replaced with S in the antisense strand, 2'OMe is present in the sense strand, and
a targeting moiety is added to either the 5' or 3' end of the sense strand of the iRNA agent.

In a preferred embodiment an asymmetrically modified iRNA agent has a
modification on the sense strand which modification is not found on the antisense strand and
the antisense strand has a modification which is not found on the sense strand.

Each strand can include one or more asymmetrical modifications. By way of 
example: one strand can include a first asymmetrical modification which confers a first 
property on the iRNA agent and the other strand can have a second asymmetrical 
modification which confers a second property on the iRNA. E.g., one strand, e.g., the sense
strand, can have a modification which targets the iRNA agent to a tissue, and the other strand,
e.g., the antisense strand, has a modification which promotes hybridization with the target
gene sequence.

In some embodiments both strands can be modified to optimize the same property, 
e.g., to increase resistance to nucleolytic degradation, but different modifications are chosen 
for the sense and the antisense strands, e.g., because the modifications affect other properties
as well. E.g., since some changes can affect RISC activity these modifications are chosen for 
the sense strand. 

In an embodiment one strand has an asymmetrical 2' modification, e.g., a 2' OMe
modification, and the other strand has an asymmetrical modification of the phosphate
backbone, e.g., a phosphorothioate modification. So, in one embodiment the antisense strand
has an asymmetrical 2' OMe modification and the sense strand has an asymmetrical
phosphorothioate modification (or vice versa). In a particularly preferred embodiment the
RNAi agent will have asymmetrical 2'-O alkyl, preferably 2'-OMe, modifications on the
sense strand and asymmetrical backbone P modification, preferably a phosphorothioate
modification, in the antisense strand. There can be one or multiple 2'-OMe modifications,
e.g., at least 2, 3, 4, 5, or 6 of the subunits of the sense strand can be so modified. There can
be one or multiple phosphorothioate modifications, e.g., at least 2, 3, 4, 5, or 6 of the
subunits of the antisense strand can be so modified. It is preferable to have an iRNA agent
wherein there are multiple 2'-OMe modifications on the sense strand and multiple
phosphorothioate modifications on the antisense strand. All of the subunits on one or both
strands can be so modified. A particularly preferred embodiment of multiple asymmetric
modification on both strands has a duplex region about 20-21, and preferably 19, subunits in
length and one or two 3' overhangs of about 2 subunits in length.

Asymmetrical modifications are useful for promoting resistance to degradation by
nucleases, e.g., endonucleases. iRNA agents can include one or more asymmetrical
modifications which promote resistance to degradation. In preferred embodiments the
modification on the antisense strand is one which will not interfere with silencing of the
target, e.g., one which will not interfere with cleavage of the target. Most if not all sites on a
strand are vulnerable, to some degree, to degradation by endonucleases. One can determine
sites which are relatively vulnerable and insert asymmetrical modifications which inhibit
degradation. It is often desirable to provide asymmetrical modification of a UA site in an
iRNA agent, and in some cases it is desirable to provide the UA sequence on both strands
with asymmetrical modification. Examples of modifications which inhibit endonucleolytic
degradation can be found herein. Particularly favored modifications include: 2'
modification, e.g., provision of a 2' OMe moiety on the U, especially on a sense strand;
modification of the backbone, e.g., with the replacement of an O with an S, in the phosphate
backbone, e.g., the provision of a phosphorothioate modification, on the U or the A or both,
especially on an antisense strand; replacement of the U with a C5 amino linker; replacement
of the A with a G (sequence changes are preferred to be located on the sense strand and not
the antisense strand); and modification at the 2', 6', 7', or 8' position. Preferred
embodiments are those in which one or more of these modifications are present on the sense
but not the antisense strand, or embodiments where the antisense strand has fewer of such
modifications.

Asymmetrical modification can be used to inhibit degradation by exonucleases. 
Asymmetrical modifications can include those in which only one strand is modified as well 
as those in which both are modified. In preferred embodiments the modification on the 
antisense strand is one which will not interfere with silencing of the target, e.g., one which 
will not interfere with cleavage of the target. Some embodiments will have an asymmetrical 
modification on the sense strand, e.g., in a 3' overhang, e.g., at the 3' terminus, and on the
antisense strand, e.g., in a 3' overhang, e.g., at the 3' terminus. If the modifications introduce
moieties of different size it is preferable that the larger be on the sense strand. If the
modifications introduce moieties of different charge it is preferable that the one with greater
charge be on the sense strand.

Examples of modifications which inhibit exonucleolytic degradation can be found
herein. Particularly favored modifications include: 2' modification, e.g., provision of a 2'
OMe moiety in a 3' overhang, e.g., at the 3' terminus (3' terminus means at the 3' atom of
the molecule or at the most 3' moiety, e.g., the most 3' P or 2' position, as indicated by the
context); modification of the backbone, e.g., with the replacement of a P with an S, e.g., the
provision of a phosphorothioate modification, or the use of a methylated P in a 3' overhang,
e.g., at the 3' terminus; combination of a 2' modification, e.g., provision of a 2' OMe
moiety, and modification of the backbone, e.g., with the replacement of a P with an S, e.g.,
the provision of a phosphorothioate modification, or the use of a methylated P, in a 3'
overhang, e.g., at the 3' terminus; modification with a 3' alkyl; modification with an abasic
pyrrolidine in a 3' overhang, e.g., at the 3' terminus; modification with naproxen, ibuprofen,
or other moieties which inhibit degradation at the 3' terminus. Preferred embodiments are
those in which one or more of these modifications are present on the sense but not the
antisense strand, or embodiments where the antisense strand has fewer of such modifications.

Modifications, e.g., those described herein, which affect targeting can be provided as 
asymmetrical modifications. Targeting modifications which can inhibit silencing, e.g., by 
inhibiting cleavage of a target, can be provided as asymmetrical modifications of the sense 
strand. A biodistribution altering moiety, e.g., cholesterol, can be provided in one or more, 
e.g., two, asymmetrical modifications of the sense strand. Targeting modifications which
introduce moieties having a relatively large molecular weight, e.g., a molecular weight of
more than 400, 500, or 1000 daltons, or which introduce a charged moiety (e.g., having more
than one positive charge or one negative charge) can be placed on the sense strand.

Modifications, e.g., those described herein, which modulate, e.g., increase or
decrease, the affinity of a strand for its complement or target, can be provided as
asymmetrical modifications. These include: 5 methyl U; 5 methyl C; pseudouridine; locked
nucleic acids; 2 thio U; and 2-amino-A. In some embodiments one or more of these is
provided on the antisense strand.

iRNA agents have a defined structure, with a sense strand and an antisense strand, 
and in many cases short single strand overhangs, e.g., of 2 or 3 nucleotides are present at one 
or both 3' ends. Asymmetrical modification can be used to optimize the activity of such a 
structure, e.g., by being placed selectively within the iRNA. E.g., the end region of the iRNA
agent defined by the 5' end of the sense strand and the 3' end of the antisense strand is
important for function. This region can include the terminal 2, 3, or 4 paired nucleotides and
any 3' overhang. In preferred embodiments asymmetrical modifications which result in one
or more of the following are used: modifications of the 5' end of the sense strand which
inhibit kinase activation of the sense strand, including, e.g., attachments of conjugates which
target the molecule or the use of modifications which protect against 5' exonucleolytic
degradation; or modifications of either strand, but preferably the sense strand, which enhance
binding between the sense and antisense strand and thereby promote a "tight" structure at this
end of the molecule.

The end region of the iRNA agent defined by the 3' end of the sense strand and the
5' end of the antisense strand is also important for function. This region can include the
terminal 2, 3, or 4 paired nucleotides and any 3' overhang. Preferred embodiments include
asymmetrical modifications of either strand, but preferably the sense strand, which decrease
binding between the sense and antisense strand and thereby promote an "open" structure at
this end of the molecule. Such modifications include placing conjugates which target the
molecule or modifications which promote nuclease resistance on the sense strand in this
region. Modifications of the antisense strand which inhibit kinase activation are avoided in
preferred embodiments.

Exemplary modifications for asymmetrical placement in the sense strand include the
following:

(a) backbone modifications, e.g., modification of a backbone P, including
replacement of P with S, or P substituted with alkyl or allyl, e.g., Me, and dithioates (S-P=S);
these modifications can be used to promote nuclease resistance;

(b) 2'-O alkyl, e.g., 2'-OMe, 3'-O alkyl, e.g., 3'-OMe (at terminal and/or internal
positions); these modifications can be used to promote nuclease resistance or to enhance
binding of the sense to the antisense strand, the 3' modifications can be used at the 5' end of
the sense strand to avoid sense strand activation by RISC;

(c) 2'-5' linkages (with 2'-H, 2'-OH and 2'-OMe and with P-O or P-S); these
modifications can be used to promote nuclease resistance or to inhibit binding of the sense to
the antisense strand, or can be used at the 5' end of the sense strand to avoid sense strand
activation by RISC;

(d) L sugars (e.g., L ribose, L-arabinose with 2'-H, 2'-OH and 2'-OMe); these
modifications can be used to promote nuclease resistance or to inhibit binding of the sense to
the antisense strand, or can be used at the 5' end of the sense strand to avoid sense strand
activation by RISC;

(e) modified sugars (e.g., locked nucleic acids (LNA's), hexose nucleic acids
(HNA's) and cyclohexene nucleic acids (CeNA's)); these modifications can be used to
promote nuclease resistance or to inhibit binding of the sense to the antisense strand, or can
be used at the 5' end of the sense strand to avoid sense strand activation by RISC;

(f) nucleobase modifications (e.g., C-5 modified pyrimidines, N-2 modified purines,
N-7 modified purines, N-6 modified purines); these modifications can be used to promote
nuclease resistance or to enhance binding of the sense to the antisense strand;

(g) cationic groups and Zwitterionic groups (preferably at a terminus); these
modifications can be used to promote nuclease resistance;

(h) conjugate groups (preferably at terminal positions), e.g., naproxen, biotin,
cholesterol, ibuprofen, folic acid, peptides, and carbohydrates; these modifications can be
used to promote nuclease resistance or to target the molecule, or can be used at the 5' end of
the sense strand to avoid sense strand activation by RISC.

Exemplary modifications for asymmetrical placement in the antisense strand include
the following:

(a) backbone modifications, e.g., modification of a backbone P, including
replacement of P with S, or P substituted with alkyl or allyl, e.g., Me, and dithioates (S-P=S);

(b) 2'-O alkyl, e.g., 2'-OMe (at terminal positions);

(c) 2'-5' linkages (with 2'-H, 2'-OH and 2'-OMe), e.g., terminal at the 3' end, e.g.,
with P=O or P=S, preferably at the 3'-end; these modifications are preferably excluded from
the 5' end region as they may interfere with RISC enzyme activity such as kinase activity;

(d) L sugars (e.g., L ribose, L-arabinose with 2'-H, 2'-OH and 2'-OMe), e.g., terminal
at the 3' end, e.g., with P=O or P=S, preferably at the 3'-end; these modifications are
preferably excluded from the 5' end region as they may interfere with kinase activity;

(e) modified sugars (e.g., LNA's, HNA's and CeNA's); these modifications are
preferably excluded from the 5' end region as they may contribute to unwanted
enhancements of pairing between the sense and antisense strands (it is often preferred to have
a "loose" structure in the 5' region); additionally, they may interfere with kinase activity;

(f) nucleobase modifications (e.g., C-5 modified pyrimidines, N-2 modified purines,
N-7 modified purines, N-6 modified purines);

(g) cationic groups and Zwitterionic groups (preferably at a terminus);

(h) conjugate groups (preferably at terminal positions), e.g., naproxen, biotin, cholesterol,
ibuprofen, folic acid, peptides, and carbohydrates, but bulky groups or generally groups
which inhibit RISC activity are less preferred.

The 5'-OH of the antisense strand should be kept free to promote activity. In some
preferred embodiments modifications that promote nuclease resistance should be included at 
the 3' end, particularly in the 3' overhang. 

In another aspect, the invention features a method of optimizing, e.g., stabilizing, an 
iRNA agent. The method includes selecting a sequence having activity, introducing one or
more asymmetric modifications into the sequence, wherein the introduction of the
asymmetric modification optimizes a property of the iRNA agent but does not result in a
decrease in activity.

The decrease in activity can be less than a preselected level of decrease. In
preferred embodiments decrease in activity means a decrease of less than 5, 10, 20, 40, or
50 % activity, as compared with an otherwise similar iRNA lacking the introduced
modification. Activity can, e.g., be measured in vivo, or in vitro, with a result in either being 
sufficient to demonstrate the required maintenance of activity. 
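
The preselected-level criterion above amounts to a simple percentage comparison. The sketch below is a hypothetical Python illustration, not a method taken from the disclosure; the function names, the units of the activity values, and the default threshold of 20 percent are assumptions chosen for the example.

```python
def activity_decrease_percent(unmodified_activity, modified_activity):
    """Percent loss of activity relative to the otherwise similar unmodified iRNA."""
    if unmodified_activity <= 0:
        raise ValueError("reference activity must be positive")
    return 100.0 * (unmodified_activity - modified_activity) / unmodified_activity


def retains_activity(unmodified_activity, modified_activity, max_decrease_percent=20.0):
    """True if the decrease is below the preselected level (assumed default: 20 %)."""
    decrease = activity_decrease_percent(unmodified_activity, modified_activity)
    return decrease < max_decrease_percent
```

Any of the levels recited above (5, 10, 20, 40, or 50 %) could be passed as `max_decrease_percent`; the activity values themselves would come from whichever in vivo or in vitro assay is used.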

The optimized property can be any property described herein and in particular the
properties discussed in the section on asymmetrical modifications provided herein. The
modification can be any asymmetrical modification, e.g., an asymmetric modification
described in the section on asymmetrical modifications described herein. Particularly
preferred asymmetric modifications are 2'-O alkyl modifications, e.g., 2'-OMe
modifications, particularly in the sense sequence, and modifications of a backbone O,
particularly phosphorothioate modifications, in the antisense sequence.

In a preferred embodiment a sense sequence is selected and provided with an 
asymmetrical modification, while in other embodiments an antisense sequence is selected 
and provided with an asymmetrical modification. In some embodiments both sense and 
antisense sequences are selected and each provided with one or more asymmetrical 
modifications. 

Multiple asymmetric modifications can be introduced into either or both of the sense 
and antisense sequence. A sequence can have at least 2, 4, 6, 8, or more modifications and 
all or substantially all of the monomers of a sequence can be modified.

Table 2. Some examples of Asymmetric Modification

This table shows examples having strand I with a selected modification and strand II
with a selected modification.

Strand I                                   Strand II

Nuclease Resistance (e.g. 2'-OMe)          Biodistribution (e.g. P=S)

Biodistribution conjugate                  Protein Binding Functionality
(e.g. Lipophile)                           (e.g. Naproxen)

Tissue Distribution Functionality          Cell Targeting Functionality
(e.g. Carbohydrates)                       (e.g. Folate for cancer cells)

Tissue Distribution Functionality          Fusogenic Functionality
(e.g. Liver Cell Targeting                 (e.g. Polyethylene imines)
Carbohydrates)

Cancer Cell Targeting                      Fusogenic Functionality
(e.g. RGD peptides and imines)             (e.g. peptides)

Nuclease Resistance (e.g. 2'-OMe)          Increase in binding Affinity (5-Me-C, 5-Me-U,
                                           2-thio-U, 2-amino-A, G-clamp, LNA)

Tissue Distribution Functionality          RISC activity improving Functionality

Helical conformation changing              Tissue Distribution Functionality
Functionalities                            (P=S; lipophile, carbohydrates)


Z-X-Y Architecture 

In one aspect, the invention features an iRNA agent which can have a Z-X-Y 
architecture or structure such as those described herein and those described in copending, co- 
owned United States Provisional Application Serial No. 60/510,246 (Attorney Docket No.
14174-079P02), filed on October 9, 2003, which is hereby incorporated by reference, and in 
copending, co-owned United States Provisional Application Serial No. 60/510,318 (Attorney 
Docket No. 14174-079P03), filed on October 10, 2003, which is hereby incorporated by 
reference. 

In addition, the invention includes iRNA agents having a Z-X-Y structure and another
element described herein. E.g., the invention includes an iRNA agent described herein, e.g.,
a palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent
which targets a gene described herein, e.g., a gene active in the liver, an iRNA associated
with an amphipathic delivery agent described herein, an iRNA associated with a drug
delivery module described herein, an iRNA agent administered as described herein, or an 
iRNA agent formulated as described herein, which also incorporates a Z-X-Y architecture. 

The invention provides an iRNA agent having a first segment, the Z region, a second 
segment, the X region, and optionally a third region, the Y region: 

Z-X-Y.

It may be desirable to modify subunits in one or both of Z and/or Y on one hand and X
on the other hand. In some cases they will have the same modification or the same class of 
modification but it will more often be the case that the modifications made in Z and/or Y will 
differ from those made in X. 

The Z region typically includes a terminus of an iRNA agent. The length of the Z
region can vary, but will typically be from 2-14, more preferably 2-10, subunits in length. It
typically is single stranded, i.e., it will not base pair with bases of another strand, though it
may in some embodiments self associate, e.g., to form a loop structure. Such structures can
be formed by the end of a strand looping back and forming an intrastrand duplex. E.g., 2, 3,
4, 5 or more intra-strand base pairs can form, having a looped out or connecting region,
typically of 2 or more subunits which do not pair. This can occur at one or both ends of a
strand. A typical embodiment of a Z region is a single strand overhang, e.g., an overhang of
the length described elsewhere herein. The Z region can thus be or include a 3' or 5'
terminal single strand. It can be sense or antisense strand but if it is antisense it is preferred
that it is a 3' overhang. Typical inter-subunit bonds in the Z region include: P=O; P=S;
S-P=S; P-NR2; and P-BR2. Chiral P=X (where X is S, N, or B) inter-subunit bonds can also be
present. (These inter-subunit bonds are discussed in more detail elsewhere herein.) Other
preferred Z region subunit modifications (also discussed elsewhere herein) can include:
3'-OR, 3'SR, 2'-OMe, 3'-OMe, and 2'OH modifications and moieties; alpha configuration
bases; and 2' arabino modifications.

The X region will in most cases be duplexed, in the case of a single strand iRNA
agent, with a corresponding region of the single strand, or in the case of a double stranded
iRNA agent, with the corresponding region of the other strand. The length of the X region
can vary but will typically be between 10-45 and more preferably between 15 and 35
subunits. Particularly preferred region X's include 17, 18, 19, 20, 21, 22, 23, 24, or 25
nucleotide pairs, though other suitable lengths are described elsewhere herein and can be
used. Typical X region subunits include 2'-OH subunits. In typical embodiments phosphate
inter-subunit bonds are preferred while phosphorothioate or non-phosphate bonds are absent.
Other modifications preferred in the X region include: modifications to improve binding,
e.g., nucleobase modifications; cationic nucleobase modifications; and C-5 modified
pyrimidines, e.g., allylamines. Some embodiments have 4 or more consecutive 2'OH
subunits. While the use of phosphorothioate is sometimes non preferred, they can be used if
they connect less than 4 consecutive 2' OH subunits.

The Y region will generally conform to the parameters set out for the Z regions.
However, the Y and Z regions need not be the same; different types and numbers of
modifications can be present, and in fact, one will usually be a 3' overhang and one will
usually be a 5' overhang.

In a preferred embodiment the iRNA agent will have a Y and/or Z region each having
ribonucleosides in which the 2'-OH is substituted, e.g., with 2'-OMe or other alkyl; and an X
region that includes at least four consecutive ribonucleoside subunits in which the 2'-OH
remains unsubstituted.

The subunit linkages (the linkages between subunits) of an iRNA agent can be
modified, e.g., to promote resistance to degradation. Numerous examples of such
modifications are disclosed herein, one example of which is the phosphorothioate linkage.
These modifications can be provided between the subunits of any of the regions, Y, X, and Z.
However, it is preferred that their occurrence is minimized and in particular it is preferred that
consecutive modified linkages be avoided.

In a preferred embodiment the iRNA agent will have a Y and Z region each having
ribonucleosides in which the 2'-OH is substituted, e.g., with 2'-OMe; and an X region that
includes at least four consecutive subunits, e.g., ribonucleoside subunits in which the 2'-OH
remains unsubstituted.

As mentioned above, the subunit linkages of an iRNA agent can be modified, e.g., to 
promote resistance to degradation. These modifications can be provided between the 
subunits of any of the regions, Y, X, and Z. However, it is preferred that they are minimized
and in particular it is preferred that consecutive modified linkages be avoided.

Thus, in a preferred embodiment, not all of the subunit linkages of the iRNA agent
are modified and more preferably the maximum number of consecutive subunits linked by
other than a phosphodiester bond will be 2, 3, or 4. Particularly preferred iRNA agents will not
have four or more consecutive subunits, e.g., 2'-hydroxyl ribonucleoside subunits, in which
each subunit is joined by modified linkages - i.e. linkages that have been modified to
stabilize them from degradation as compared to the phosphodiester linkages that naturally
occur in RNA and DNA.

It is particularly preferred to minimize the occurrence in region X. Thus, in preferred
embodiments each of the nucleoside subunit linkages in X will be phosphodiester linkages,
or if subunit linkages in region X are modified, such modifications will be minimized. E.g.,
although the Y and/or Z regions can include inter subunit linkages which have been
stabilized against degradation, such modifications will be minimized in the X region, and in
particular consecutive modifications will be minimized. Thus, in preferred embodiments the
maximum number of consecutive subunits linked by other than a phosphodiester bond will be
2, 3, or 4. Particularly preferred X regions will not have four or more consecutive subunits,
e.g., 2'-hydroxyl ribonucleoside subunits, in which each subunit is joined by modified
linkages - i.e. linkages that have been modified to stabilize them from degradation as
compared to the phosphodiester linkages that naturally occur in RNA and DNA.
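
The preference against consecutive modified linkages can be checked mechanically. The following is a minimal sketch, not part of the disclosure: it assumes a strand is described by a list of linkage labels in 5' to 3' order, and the labels "PO" (phosphodiester) and "PS" (phosphorothioate), the function names, and the default limit of 3 subunits are illustrative choices only.

```python
from itertools import groupby

PHOSPHODIESTER = "PO"  # assumed label for an unmodified linkage


def longest_modified_subunit_run(linkages):
    """Largest number of consecutive subunits joined only by modified linkages.

    `linkages` lists the inter-subunit linkages in 5'->3' order, so a strand of
    n subunits has n - 1 entries. A run of k modified linkages joins k + 1
    subunits; with no modified linkage the result is 0.
    """
    longest = 0
    for is_modified, run in groupby(linkages, key=lambda link: link != PHOSPHODIESTER):
        if is_modified:
            longest = max(longest, len(list(run)) + 1)
    return longest


def meets_linkage_preference(linkages, max_subunits=3):
    """True if no stretch of modified linkages joins more than `max_subunits` subunits."""
    return longest_modified_subunit_run(linkages) <= max_subunits
```

With the default limit, a region in which three adjacent phosphorothioate linkages join four subunits would be flagged, which corresponds to the "four or more consecutive subunits" case described above; passing max_subunits=2 or 4 covers the other stated limits.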

In a preferred embodiment Y and/or Z will be free of phosphorothioate linkages,
though either or both may contain other modifications, e.g., other modifications of the
subunit linkages.

In a preferred embodiment region X, or in some cases, the entire iRNA agent, has no 
more than 3 or no more than 4 subunits having identical 2' moieties.

In a preferred embodiment region X, or in some cases, the entire iRNA agent, has no 
more than 3 or no more than 4 subunits having identical subunit linkages. 

In a preferred embodiment one or more phosphorothioate linkages (or other
modifications of the subunit linkage) are present in Y and/or Z, but such modified linkages
do not connect two adjacent subunits, e.g., nucleosides, having a 2' modification, e.g., a
2'-O-alkyl moiety. E.g., any adjacent 2'-O-alkyl moieties in the Y and/or Z are connected by a
linkage other than a phosphorothioate linkage.

In a preferred embodiment each of Y and/or Z independently has only one
phosphorothioate linkage between adjacent subunits, e.g., nucleosides, having a 2'
modification, e.g., 2'-O-alkyl nucleosides. If there is a second set of adjacent subunits, e.g.,
nucleosides, having a 2' modification, e.g., 2'-O-alkyl nucleosides, in Y and/or Z that
second set is connected by a linkage other than a phosphorothioate linkage, e.g., a modified
linkage other than a phosphorothioate linkage.

In a preferred embodiment each of Y and/or Z independently has more than one
phosphorothioate linkage connecting adjacent pairs of subunits, e.g., nucleosides, having a 2'
modification, e.g., 2'-O-alkyl nucleosides, but at least one pair of adjacent subunits, e.g.,
nucleosides, having a 2' modification, e.g., 2'-O-alkyl nucleosides, is connected by a
linkage other than a phosphorothioate linkage, e.g., a modified linkage other than a
phosphorothioate linkage.

In a preferred embodiment one of the above recited limitations on adjacent subunits in
Y and/or Z is combined with a limitation on the subunits in X. E.g., one or more
phosphorothioate linkages (or other modifications of the subunit linkage) are present in Y
and/or Z, but such modified linkages do not connect two adjacent subunits, e.g., nucleosides,
having a 2' modification, e.g., a 2'-O-alkyl moiety. E.g., any adjacent 2'-O-alkyl moieties in
the Y and/or Z are connected by a linkage other than a phosphorothioate linkage. In
addition, the X region has no more than 3 or no more than 4 identical subunits, e.g., subunits
having identical 2' moieties or the X region has no more than 3 or no more than 4 subunits
having identical subunit linkages.

A Y and/or Z region can include at least one, and preferably 2, 3 or 4 of a
modification disclosed herein. Such modifications can be chosen, independently, from any
modification described herein, e.g., from nuclease resistant subunits, subunits with modified
bases, subunits with modified intersubunit linkages, subunits with modified sugars, and
subunits linked to another moiety, e.g., a targeting moiety. In a preferred embodiment more
than 1 of such subunits can be present but in some embodiments it is preferred that no more
than 1, 2, 3, or 4 of such modifications occur, or occur consecutively. In a preferred
embodiment the frequency of the modification will differ between Y and/or Z and X, e.g., the
modification will be present in one of Y and/or Z or X and absent in the other.

An X region can include at least one, and preferably 2, 3 or 4 of a modification
disclosed herein. Such modifications can be chosen, independently, from any modification
described herein, e.g., from nuclease resistant subunits, subunits with modified bases, subunits
with modified intersubunit linkages, subunits with modified sugars, and subunits linked to
another moiety, e.g., a targeting moiety. In a preferred embodiment more than 1 of such
subunits can be present but in some embodiments it is preferred that no more than 1, 2, 3, or 4
of such modifications occur, or occur consecutively.

An RRMS (described elsewhere herein) can be introduced at one or more points in one
or both strands of a double-stranded iRNA agent. An RRMS can be placed in a Y and/or Z
region, at or near (within 1, 2, or 3 positions of) the 3' or 5' end of the sense strand or at or near
(within 2 or 3 positions of) the 3' end of the antisense strand. In some embodiments it is
preferred to not have an RRMS at or near (within 1, 2, or 3 positions of) the 5' end of the
antisense strand. An RRMS can be positioned in the X region, and will preferably be
positioned in the sense strand or in an area of the antisense strand not critical for antisense
binding to the target.

Differential Modification of Terminal Duplex Stability

In one aspect, the invention features an iRNA agent which can have differential 
modification of terminal duplex stability (DMTDS). 

In addition, the invention includes iRNA agents having DMTDS and another element 
described herein. E.g., the invention includes an iRNA agent described herein, e.g., a 
palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having 
an architecture or structure described herein, an iRNA associated with an amphipathic 
delivery agent described herein, an iRNA associated with a drug delivery module described 
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as 
described herein, which also incorporates DMTDS. 

iRNA agents can be optimized by increasing the propensity of the duplex to
disassociate or melt (decreasing the free energy of duplex association), in the region of the 5'
end of the antisense strand duplex. This can be accomplished, e.g., by the inclusion of
subunits which increase the propensity of the duplex to disassociate or melt in the region of
the 5' end of the antisense strand. It can also be accomplished by the attachment of a ligand
that increases the propensity of the duplex to disassociate or melt in the region of the 5' end.
While not wishing to be bound by theory, the effect may be due to promoting the effect of an
enzyme such as helicase, for example, promoting the effect of the enzyme in the proximity of
the 5' end of the antisense strand.

The inventors have also discovered that iRNA agents can be optimized by decreasing
the propensity of the duplex to disassociate or melt (increasing the free energy of duplex
association), in the region of the 3' end of the antisense strand duplex. This can be
accomplished, e.g., by the inclusion of subunits which decrease the propensity of the duplex
to disassociate or melt in the region of the 3' end of the antisense strand. It can also be
accomplished by the attachment of a ligand that decreases the propensity of the duplex to
disassociate or melt in the region of the 5' end.

Modifications which increase the tendency of the 5' end of the duplex to dissociate
can be used alone or in combination with other modifications described herein, e.g., with
modifications which decrease the tendency of the 3' end of the duplex to dissociate.
Likewise, modifications which decrease the tendency of the 3' end of the duplex to dissociate
can be used alone or in combination with other modifications described herein, e.g., with
modifications which increase the tendency of the 5' end of the duplex to dissociate.

Decreasing the stability of the AS 5' end of the duplex

Subunit pairs can be ranked on the basis of their propensity to promote dissociation or
melting (e.g., on the free energy of association or dissociation of a particular pairing; the
simplest approach is to examine the pairs on an individual pair basis, though next neighbor or
similar analysis can also be used). In terms of promoting dissociation:

A:U is preferred over G:C; 
G:U is preferred over G:C; 
I:C is preferred over G:C (I=inosine); 

mismatches, e.g., non-canonical or other than canonical pairings (as described
elsewhere herein) are preferred over canonical (A:T, A:U, G:C) pairings;
pairings which include a universal base are preferred over canonical pairings.

A typical ds iRNA agent can be diagrammed as follows: 

S     5' R1 N1 N2 N3 N4 N5 [N] N-5 N-4 N-3 N-2 N-1 R2 3'
AS    3' R3 N1 N2 N3 N4 N5 [N] N-5 N-4 N-3 N-2 N-1 R4 5'

S:AS        P1 P2 P3 P4 P5 [N] P-5 P-4 P-3 P-2 P-1 5'

S indicates the sense strand; AS indicates antisense strand; R1 indicates an optional
(and nonpreferred) 5' sense strand overhang; R2 indicates an optional (though preferred) 3'
sense overhang; R3 indicates an optional (though preferred) 3' antisense overhang; R4
indicates an optional (and nonpreferred) 5' antisense overhang; N indicates subunits; [N]
indicates that additional subunit pairs may be present; and Px indicates a pairing of sense Nx
and antisense Nx. Overhangs are not shown in the P diagram. In some embodiments a 3' AS
overhang corresponds to region Z, the duplex region corresponds to region X, and the 3' S
strand overhang corresponds to region Y, as described elsewhere herein. (The diagram is not
meant to imply maximum or minimum lengths, on which guidance is provided elsewhere
herein.)
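
The pair numbering in the diagram can be illustrated with a short sketch. It is offered only as an aid to reading the notation: it assumes both strands are supplied 5' to 3' with overhangs already removed, and the function names are hypothetical.

```python
def duplex_pairs(sense_5to3, antisense_5to3):
    """Return the S:AS pairs ordered P1 ... P-1 (overhangs assumed removed).

    P1 pairs the 5'-most duplexed sense subunit with the 3'-most duplexed
    antisense subunit; P-1 pairs the 3'-most sense subunit with the 5'-most
    antisense subunit, as in the diagram.
    """
    if len(sense_5to3) != len(antisense_5to3):
        raise ValueError("duplexed portions must have the same length")
    antisense_3to5 = antisense_5to3[::-1]  # align antisense 3'->5' under the sense strand
    return list(zip(sense_5to3, antisense_3to5))


def pair_at(pairs, k):
    """Look up pair Pk: k = 1, 2, ... counts from the AS 3' end, k = -1, -2, ... from the AS 5' end."""
    if k == 0 or abs(k) > len(pairs):
        raise IndexError("no such pair position")
    return pairs[k - 1] if k > 0 else pairs[k]
```

For a sense strand GCAU and antisense strand AUGC (both written 5' to 3'), P1 is G:C and P-1 is U:A, each written sense:antisense.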

It is preferred that pairings which decrease the propensity to form a duplex are used at
1 or more of the positions in the duplex at the 5' end of the AS strand. The terminal pair (the
most 5' pair in terms of the AS strand) is designated as P-1, and the subsequent pairing
positions (going in the 3' direction in terms of the AS strand) in the duplex are designated
P-2, P-3, P-4, P-5, and so on. The preferred region in which to modify to modulate duplex
formation is at P-5 through P-1, more preferably P-4 through P-1, more preferably P-3 through
P-1. Modification at P-1 is particularly preferred, alone or with modification(s) at other
position(s), e.g., any of the positions just identified. It is preferred that at least 1, and more
preferably 2, 3, 4, or 5 of the pairs of one of the recited regions be chosen independently
from the group of:

A:U 
G:U 
I:C 

mismatched pairs, e.g., non-canonical or other than canonical pairings or pairings 
which include a universal base. 
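
As a sketch of how the group above might be applied, the following counts the pairs in P-4 through P-1 that belong to it. Several points are assumptions made only for illustration: pairs are treated as unordered (so U:A counts as A:U), "N" stands in for a universal base, and any pairing that is not Watson-Crick is treated as a mismatch.

```python
WATSON_CRICK = {frozenset("AU"), frozenset("AT"), frozenset("GC")}
LISTED_PAIRS = {frozenset("AU"), frozenset("GU"), frozenset("IC")}  # A:U, G:U, I:C
UNIVERSAL = "N"  # hypothetical symbol for a universal base


def promotes_dissociation(pair):
    """True if the (sense, antisense) pair falls in the group listed above."""
    bases = frozenset(pair)
    if UNIVERSAL in bases or bases in LISTED_PAIRS:
        return True
    return bases not in WATSON_CRICK  # any other non-canonical pairing is a mismatch


def count_at_as_5prime_end(pairs, window=4):
    """Count qualifying pairs among P-window ... P-1; `pairs` is ordered P1 ... P-1."""
    return sum(promotes_dissociation(pair) for pair in pairs[-window:])
```

A result of 2 or 3 for the default window corresponds to the embodiments in which at least 2, or 3, of the pairs in P-1 through P-4 promote dissociation.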

In preferred embodiments the change in subunit needed to achieve a pairing which 
promotes dissociation will be made in the sense strand, though in some embodiments the 
change will be made in the antisense strand. 

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are pairs
which promote dissociation.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are A:U.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are G:U.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are I:C.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are
mismatched pairs, e.g., non-canonical or other than canonical pairings.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are
pairings which include a universal base.

Increasing the stability of the AS 3' end of the duplex

Subunit pairs can be ranked on the basis of their propensity to promote stability and
inhibit dissociation or melting (e.g., on the free energy of association or dissociation of a
particular pairing; the simplest approach is to examine the pairs on an individual pair basis,
though next neighbor or similar analysis can also be used). In terms of promoting duplex
stability:

G:C is preferred over A:U

Watson-Crick matches (A:T, A:U, G:C) are preferred over non-canonical or other
than canonical pairings

analogs that increase stability are preferred over Watson-Crick matches (A:T, A:U,
G:C)

2-amino-A:U is preferred over A:U
2-thio U or 5 Me-thio-U:A are preferred over U:A

G-clamp (an analog of C having 4 hydrogen bonds):G is preferred over C:G
guanidinium-G-clamp:G is preferred over C:G
pseudouridine:A is preferred over U:A

sugar modifications, e.g., 2' modifications, e.g., 2'F, ENA, or LNA, which enhance
binding are preferred over non-modified moieties and can be present on one or both strands
to enhance stability of the duplex.

It is preferred that pairings which increase the propensity
to form a duplex are used at 1 or more of the positions in the duplex at the 3' end of the AS
strand. The terminal pair (the most 3' pair in terms of the AS strand) is designated as P1, and
the subsequent pairing positions (going in the 5' direction in terms of the AS strand) in the
duplex are designated P2, P3, P4, P5, and so on. The preferred region in which to modify to
modulate duplex formation is at P5 through P1, more preferably P4 through P1, more
preferably P3 through P1. Modification at P1 is particularly preferred, alone or with
modification(s) at other position(s), e.g., any of the positions just identified. It is preferred that
at least 1, and more preferably 2, 3, 4, or 5 of the pairs of the recited regions be chosen
independently from the group of:

G:C

a pair having an analog that increases stability over Watson-Crick matches (A:T,
A:U, G:C)

2-amino-A:U
2-thio U or 5 Me-thio-U:A

G-clamp (an analog of C having 4 hydrogen bonds):G
guanidinium-G-clamp:G
pseudouridine:A

a pair in which one or both subunits has a sugar modification, e.g., a 2'
modification, e.g., 2'F, ENA, or LNA, which enhance binding.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are pairs
which promote duplex stability.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are G:C.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are a pair
having an analog that increases stability over Watson-Crick matches.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are
2-amino-A:U.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are 2-thio
U or 5 Me-thio-U:A.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are
G-clamp:G.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are
guanidinium-G-clamp:G.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are
pseudouridine:A.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are a pair
in which one or both subunits has a sugar modification, e.g., a 2' modification, e.g., 2'F,
ENA, or LNA, which enhances binding.

G-clamps and guanidinium G-clamps are discussed in the following references:
Holmes and Gait, "The Synthesis of 2'-O-Methyl G-Clamp Containing Oligonucleotides and
Their Inhibition of the HIV-1 Tat-TAR Interaction," Nucleosides, Nucleotides & Nucleic
Acids, 22:1259-1262, 2003; Holmes et al., "Steric inhibition of human immunodeficiency
virus type-1 Tat-dependent trans-activation in vitro and in cells by oligonucleotides
containing 2'-O-methyl G-clamp ribonucleoside analogues," Nucleic Acids Research,
31:2759-2768, 2003; Wilds et al., "Structural basis for recognition of guanosine by a
synthetic tricyclic cytosine analogue: Guanidinium G-clamp," Helvetica Chimica Acta,
86:966-978, 2003; Rajeev et al., "High-Affinity Peptide Nucleic Acid Oligomers
Containing Tricyclic Cytosine Analogues," Organic Letters, 4:4395-4398, 2002; Ausin et
al., "Synthesis of Amino- and Guanidino-G-Clamp PNA Monomers," Organic Letters,
4:4073-4075, 2002; Maier et al., "Nuclease resistance of oligonucleotides containing the
tricyclic cytosine analogues phenoxazine and 9-(2-aminoethoxy)-phenoxazine ("G-clamp")
and origins of their nuclease resistance properties," Biochemistry, 41:1323-7, 2002;
Flanagan et al., "A cytosine analog that confers enhanced potency to antisense
oligonucleotides," Proceedings Of The National Academy Of Sciences Of The United States
Of America, 96:3513-8, 1999.

Simultaneously decreasing the stability of the AS 5' end of the duplex and increasing
the stability of the AS 3' end of the duplex

As is discussed above, an iRNA agent can be modified to both decrease the stability
of the AS 5' end of the duplex and increase the stability of the AS 3' end of the duplex. This
can be effected by combining one or more of the stability decreasing modifications in the AS
5' end of the duplex with one or more of the stability increasing modifications in the AS 3'
end of the duplex. Accordingly a preferred embodiment includes modification in P-5 through
P-1, more preferably P-4 through P-1 and more preferably P-3 through P-1. Modification at P-1
is particularly preferred, alone or with other positions, e.g., the positions just identified. It is
preferred that at least 1, and more preferably 2, 3, 4, or 5 of the pairs of one of the recited
regions of the AS 5' end of the duplex region be chosen independently from the group of:

A:U
G:U
I:C

mismatched pairs, e.g., non-canonical or other than canonical pairings or pairings which
include a universal base; and

a modification in P5 through P1, more preferably P4 through P1 and more preferably
P3 through P1. Modification at P1 is particularly preferred, alone or with other positions, e.g.,
the positions just identified. It is preferred that at least 1, and more preferably 2, 3, 4, or 5 of
the pairs of one of the recited regions of the AS 3' end of the duplex region be chosen
independently from the group of:

G:C

a pair having an analog that increases stability over Watson-Crick matches (A:T,
A:U, G:C)

2-amino-A:U
2-thio U or 5 Me-thio-U:A

G-clamp (an analog of C having 4 hydrogen bonds):G
guanidinium-G-clamp:G
pseudouridine:A

a pair in which one or both subunits has a sugar modification, e.g., a 2'
modification, e.g., 2'F, ENA, or LNA, which enhance binding.

The invention also includes methods of selecting and making iRNA agents having 
DMTDS. E.g., when screening a target sequence for candidate sequences for use as iRNA 
agents one can select sequences having a DMTDS property described herein or one which 
can be modified, preferably with as few changes as possible, especially to the
AS strand, to provide a desired level of DMTDS.
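
One way such a screen might look is sketched below. It is an illustration under stated assumptions rather than the disclosed method: the target is an RNA string of A, C, G and U, the candidate duplex is fully complementary (so every pair is A:U or G:C), and the 19-subunit length, 4-pair window and threshold of 2 are illustrative defaults drawn from the ranges discussed above. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Antisense strand (5'->3') for a sense strand given 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def screen_candidates(target, length=19, window=4, min_pairs=2):
    """Return (start, sense, antisense) for windows of `target` that already show DMTDS.

    The sense strand has the target's sequence, so its 3' end pairs with the
    AS 5' end (P-window ... P-1) and its 5' end pairs with the AS 3' end
    (P1 ... Pwindow).
    """
    hits = []
    for start in range(len(target) - length + 1):
        sense = target[start:start + length]
        weak_at_as_5 = sum(base in "AU" for base in sense[-window:])   # A:U pairs
        strong_at_as_3 = sum(base in "GC" for base in sense[:window])  # G:C pairs
        if weak_at_as_5 >= min_pairs and strong_at_as_3 >= min_pairs:
            hits.append((start, sense, reverse_complement(sense)))
    return hits
```

Windows that fall just short of the thresholds could instead be passed on for modification, e.g., by changing a sense strand subunit to create a mismatch near the AS 5' end.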

The invention also includes providing a candidate iRNA agent sequence, and
modifying at least one P in P-5 through P-1 and/or at least one P in P5 through P1 to provide a
DMTDS iRNA agent.

DMTDS iRNA agents can be used in any method described herein, e.g., to silence
any gene disclosed herein, to treat any disorder described herein, in any formulation
described herein, and generally in and/or with the methods and compositions described
elsewhere herein. DMTDS iRNA agents can incorporate other modifications described
herein, e.g., the attachment of targeting agents or the inclusion of modifications which
enhance stability, e.g., the inclusion of nuclease resistant monomers or the inclusion of single
strand overhangs (e.g., 3' AS overhangs and/or 3' S strand overhangs) which self associate to
form intrastrand duplex structure.

Preferably these iRNA agents will have an architecture described herein.

Other Embodiments 
In vivo Delivery 

An iRNA agent can be linked, e.g., noncovalently linked to a polymer for the efficient
delivery of the iRNA agent to a subject, e.g., a mammal, such as a human. The iRNA agent
can, for example, be complexed with cyclodextrin. Cyclodextrins have been used as delivery
vehicles of therapeutic compounds. Cyclodextrins can form inclusion complexes with drugs
that are able to fit into the hydrophobic cavity of the cyclodextrin. In other examples,
cyclodextrins form non-covalent associations with other biologically active molecules such
as oligonucleotides and derivatives thereof. The use of cyclodextrins creates a water-soluble
drug delivery complex that can be modified with targeting or other functional groups.
Cyclodextrin cellular delivery systems for oligonucleotides described in U.S. Pat. No.
5,691,316, which is hereby incorporated by reference, are suitable for use in methods of the
invention. In this system, an oligonucleotide is noncovalently complexed with a
cyclodextrin, or the oligonucleotide is covalently bound to adamantane which in turn is
non-covalently associated with a cyclodextrin.

The delivery molecule can include a linear cyclodextrin copolymer or a linear
oxidized cyclodextrin copolymer having at least one ligand bound to the cyclodextrin
copolymer. Delivery systems, as described in U.S. Patent No. 6,509,323, herein
incorporated by reference, are suitable for use in methods of the invention. An iRNA agent
can be bound to the linear cyclodextrin copolymer and/or a linear oxidized cyclodextrin
copolymer. Either or both of the cyclodextrin or oxidized cyclodextrin copolymers can be
crosslinked to another polymer and/or bound to a ligand.

A composition for iRNA delivery can employ an "inclusion complex," a molecular
compound having the characteristic structure of an adduct. In this structure, the "host
molecule" spatially encloses at least part of another compound in the delivery vehicle. The
enclosed compound (the "guest molecule") is situated in the cavity of the host molecule
without affecting the framework structure of the host. A "host" is preferably cyclodextrin,
but can be any of the molecules suggested in U.S. Patent Publ. 2003/0008818, herein
incorporated by reference.

Cyclodextrins can interact with a variety of ionic and molecular species, and the
resulting inclusion compounds belong to the class of "host-guest" complexes. Within the
host-guest relationship, the binding sites of the host and guest molecules should be
complementary in the stereoelectronic sense. A composition of the invention can contain at
least one polymer and at least one therapeutic agent, generally in the form of a particulate
composite of the polymer and therapeutic agent, e.g., the iRNA agent. The iRNA agent can
contain one or more complexing agents. At least one polymer of the particulate composite
can interact with the complexing agent in a host-guest or a guest-host interaction to form an
inclusion complex between the polymer and the complexing agent. The polymer and, more
particularly, the complexing agent can be used to introduce functionality into the
composition. For example, at least one polymer of the particulate composite has host
functionality and forms an inclusion complex with a complexing agent having guest
functionality. Alternatively, at least one polymer of the particulate composite has guest
functionality and forms an inclusion complex with a complexing agent having host
functionality. A polymer of the particulate composite can also contain both host and guest
functionalities and form inclusion complexes with guest complexing agents and host
complexing agents. A polymer with functionality can, for example, facilitate cell targeting
and/or cell contact (e.g., targeting or contact to a liver cell), intercellular trafficking, and/or
cell entry and release.

Upon forming the particulate composite, the iRNA agent may or may not retain its 
biological or therapeutic activity. Upon release from the therapeutic composition, 
specifically, from the polymer of the particulate composite, the activity of the iRNA agent is 
restored. Accordingly, the particulate composite advantageously affords the iRNA agent 
protection against loss of activity due to, for example, degradation and offers enhanced 
bioavailability. Thus, a composition may be used to provide stability, particularly storage or 
solution stability, to an iRNA agent or any active chemical compound. The iRNA agent may
be further modified with a ligand prior to or after particulate composite or therapeutic
composition formation. The ligand can provide further functionality. For example, the
ligand can be a targeting moiety.

Physiological Effects 

The iRNA agents described herein can be designed such that determining therapeutic 
toxicity is made easier by the complementarity of the iRNA agent with both a human and a 
non-human animal sequence. By these methods, an iRNA agent can consist of a sequence 
that is fully complementary to a nucleic acid sequence from a human and a nucleic acid 
sequence from at least one non-human animal, e.g., a non-human mammal, such as a rodent,
ruminant or primate. For example, the non-human mammal can be a mouse, rat, dog, pig,
goat, sheep, cow, monkey, Pan paniscus, Pan troglodytes, Macaca mulatta, or cynomolgus
monkey. The sequence of the iRNA agent could be complementary to sequences within
homologous genes, e.g., oncogenes or tumor suppressor genes, of the non-human mammal
and the human. By determining the toxicity of the iRNA agent in the non-human mammal, 
one can extrapolate the toxicity of the iRNA agent in a human. For a more strenuous toxicity 
test, the iRNA agent can be complementary to a human and more than one, e.g., two or three 
or more, non-human animals. 
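
Candidate target sites shared across species can be located with a simple exact-match search. The sketch below is illustrative only: it assumes the homologous transcripts are available as plain sequence strings, and the function names and the default site length of 19 are assumptions rather than values taken from this section.

```python
def kmers(sequence, length):
    """All subsequences of the given length, as a set."""
    return {sequence[i:i + length] for i in range(len(sequence) - length + 1)}


def shared_target_sites(human, animals, length=19):
    """Sites present in the human sequence and in every non-human sequence.

    An iRNA agent whose antisense strand is fully complementary to one of
    these sites is fully complementary to all of the supplied sequences.
    """
    shared = kmers(human, length)
    for sequence in animals:
        shared &= kmers(sequence, length)
    return sorted(shared)
```

Supplying two or three non-human sequences in `animals` corresponds to the more strenuous test described above, since a site must then be conserved in every species.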

The methods described herein can be used to correlate any physiological effect of an iRNA 
agent on a human, e.g., any unwanted effect, such as a toxic effect, or any positive, or desired 
effect. 

Delivery Module 

In one aspect, the invention features a drug delivery conjugate or module, such as 
those described herein and those described in copending, co-owned United States Provisional 
Application Serial No. 60/454,265, filed on March 12, 2003, which is hereby incorporated by 
reference. 

In addition, the invention includes iRNA agents described herein, e.g., a palindromic
iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent which targets a
gene described herein, e.g., a gene active in the liver, an iRNA agent having a chemical
modification described herein, e.g., a modification which enhances resistance to degradation,
an iRNA agent having an architecture or structure described herein, an iRNA agent
administered as described herein, or an iRNA agent formulated as described herein,
combined with, associated with, and delivered by such a drug delivery conjugate or module.

The iRNA agents can be complexed to a delivery agent that features a modular 
complex. The complex can include a carrier agent linked to one or more of (preferably two 
or more, more preferably all three of): (a) a condensing agent (e.g., an agent capable of 
attracting, e.g., binding, a nucleic acid, e.g., through ionic or electrostatic interactions); (b) a
fusogenic agent (e.g., an agent capable of fusing and/or being transported through a cell
membrane, e.g., an endosome membrane); and (c) a targeting group, e.g., a cell or tissue
targeting agent, e.g., a lectin, glycoprotein, lipid or protein, e.g., an antibody, that binds to a
specified cell type such as a cancer cell, endothelial cell or bone cell.

An iRNA agent, e.g., iRNA agent or sRNA agent described herein, can be linked,
e.g., coupled or bound, to the modular complex. The iRNA agent can interact with the 
condensing agent of the complex, and the complex can be used to deliver an iRNA agent to a 
cell, e.g., in vitro or in vivo. For example, the complex can be used to deliver an iRNA agent 
to a subject in need thereof, e.g., to deliver an iRNA agent to a subject having a disorder, e.g., 
a disorder described herein, such as a disease or disorder of the liver. 

The fusogenic agent and the condensing agent can be different agents or the one and 
the same agent. For example, a polyamino chain, e.g., polyethyleneimine (PEI), can be the 
fusogenic and/or the condensing agent. 

The deUvery agent can be a modular complex. For example, the complex can include 
a carrier agent linked to one or more of (preferably two or more, more preferably all three 
of): 

(a) a condensing agent (e.g., an agent capable of attracting, e.g., binding, a nucleic 
acid, e.g., through ionic interaction), 

(b) a fusogenic agent (e.g., an agent capable of fusing and/or being transported 
through a cell membrane, e.g., an endosome membrane), and 

(c) a targeting group, e.g., a cell or tissue targeting agent, e.g., a lectin, glycoprotein, 
lipid or protein, e.g., an antibody, that binds to a specified cell type such as a cancer cell, 
endothelial cell, or bone cell. A targeting group can be a thyrotropin, melanotropin, lectin, 
glycoprotein, surfactant protein A, mucin carbohydrate, multivalent lactose, multivalent 
galactose, N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent mannose, multivalent 
fucose, glycosylated polyaminoacids, transferrin, bisphosphonate, 
polyglutamate, polyaspartate, a lipid, cholesterol, a steroid, bile acid, folate, vitamin B12, 
biotin, Neproxin, or an RGD peptide or RGD peptide mimetic. 

Carrier agents 

The carrier agent of a modular complex described herein can be a substrate for 
attachment of one or more of: a condensing agent, a fusogenic agent, and a targeting group. 
The carrier agent would preferably lack an endogenous enzymatic activity. The agent would 
preferably be a biological molecule, preferably a macromolecule. Polymeric biological 
carriers are preferred. It would also be preferred that the carrier molecule be biodegradable. 

The carrier agent can be a naturally occurring substance, such as a protein (e.g., 
human serum albumin (HSA), low-density lipoprotein (LDL), or globulin); carbohydrate 
(e.g., a dextran, pullulan, chitin, chitosan, inulin, cyclodextrin or hyaluronic acid); or lipid. 
The carrier molecule can also be a recombinant or synthetic molecule, such as a synthetic 
polymer, e.g., a synthetic polyamino acid. Examples of polyamino acids include polylysine 
(PLL), poly L-aspartic acid, poly L-glutamic acid, styrene-maleic acid anhydride copolymer, 
poly(L-lactide-co-glycolide) copolymer, divinyl ether-maleic anhydride copolymer, N-(2- 
hydroxypropyl)methacrylamide copolymer (HMPA), polyethylene glycol (PEG), polyvinyl 
alcohol (PVA), polyurethane, poly(2-ethylacrylic acid), N-isopropylacrylamide polymers, or 
polyphosphazine. Other useful carrier molecules can be identified by routine methods. 

A carrier agent can be characterized by one or more of: (a) is at least 1 Da in size; (b) 
has at least 5 charged groups, preferably between 5 and 5000 charged groups; (c) is present 
in the complex at a ratio of at least 1:1 carrier agent to fusogenic agent; (d) is present in the 
complex at a ratio of at least 1:1 carrier agent to condensing agent; (e) is present in the 
complex at a ratio of at least 1:1 carrier agent to targeting agent. 

Fusogenic agents 

A fusogenic agent of a modular complex described herein can be an agent that is 
responsive to, e.g., changes charge depending on, the pH environment. Upon encountering 
the pH of an endosome, it can cause a physical change, e.g., a change in osmotic properties 
which disrupts or increases the permeability of the endosome membrane. Preferably, the 
fusogenic agent changes charge, e.g., becomes protonated, at pH lower than physiological 
range. For example, the fusogenic agent can become protonated at pH 4.5-6.5. The 
fusogenic agent can serve to release the iRNA agent into the cytoplasm of a cell after the 
complex is taken up, e.g., via endocytosis, by the cell, thereby increasing the cellular 
concentration of the iRNA agent in the cell. 

In one embodiment, the fusogenic agent can have a moiety, e.g., an amino group, 
which, when exposed to a specified pH range, will undergo a change, e.g., in charge, e.g., 
protonation. The change in charge of the fusogenic agent can trigger a change, e.g., an 
osmotic change, in a vesicle, e.g., an endocytic vesicle, e.g., an endosome. For example, the 
fusogenic agent, upon being exposed to the pH environment of an endosome, will cause a 
solubility or osmotic change substantial enough to increase the porosity of (preferably, to 
rupture) the endosomal membrane. 
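
The pH-dependent protonation described above can be illustrated with a minimal sketch that applies the Henderson-Hasselbalch relation to a single ionizable amino group. The pKa value and the function name below are assumptions chosen only for illustration; they are not taken from this disclosure, and polyamines such as PEI carry many coupled ionizable groups that a single-site model does not capture.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of a basic group (e.g., an amine) that is protonated at a given pH.

    Single-site Henderson-Hasselbalch model: [BH+] / ([B] + [BH+]).
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


# Hypothetical pKa, assumed for illustration only.
ASSUMED_PKA = 6.0

for label, pH in [
    ("physiological pH 7.4", 7.4),
    ("endosomal pH 6.5", 6.5),
    ("endosomal pH 4.5", 4.5),
]:
    print(f"{label}: {fraction_protonated(pH, ASSUMED_PKA):.2f} protonated")
```

Under this simplified model, a group with a pKa inside the 4.5-6.5 window is largely unprotonated at physiological pH and becomes increasingly protonated as the endosome acidifies.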

The fusogenic agent can be a polymer, preferably a polyamino chain, e.g., 
polyethyleneimine (PEI). The PEI can be linear, branched, synthetic or natural. The PEI can 
be, e.g., alkyl substituted PEI, or lipid substituted PEI. 

In other embodiments, the fusogenic agent can be polyhistidine, polyimidazole, 
polypyridine, polypropyleneimine, melittin, or a polyacetal substance, e.g., a cationic 
polyacetal. In some embodiments, the fusogenic agent can have an alpha helical structure. 
The fusogenic agent can be a membrane disruptive agent, e.g., melittin. 

A fusogenic agent can have one or more of the following characteristics: (a) is at least 
1 Da in size; (b) has at least 10 charged groups, preferably between 10 and 5000 charged 
groups, more preferably between 50 and 1000 charged groups; (c) is present in the complex 
at a ratio of at least 1:1 fusogenic agent to carrier agent; (d) is present in the complex at a 
ratio of at least 1:1 fusogenic agent to condensing agent; (e) is present in the complex at a 
ratio of at least 1:1 fusogenic agent to targeting agent. 

Other suitable fusogenic agents can be tested and identified by a skilled artisan. The 
ability of a compound to respond to, e.g., change charge depending on, the pH environment 
can be tested by routine methods, e.g., in a cellular assay. For example, a test compound is 
combined or contacted with a cell, and the cell is allowed to take up the test compound, e.g., 
by endocytosis. An endosome preparation can then be made from the contacted cells and the 
endosome preparation compared to an endosome preparation from control cells. A change, 
e.g., a decrease, in the endosome fraction from the contacted cell vs. the control cell indicates 
that the test compound can function as a fusogenic agent. Alternatively, the contacted cell 
and control cell can be evaluated, e.g., by microscopy, e.g., by light or electron microscopy, 
to determine a difference in endosome population in the cells. The test compound can be 
labeled. In another type of assay, a modular complex described herein is constructed using 
one or more test or putative fusogenic agents. The modular complex can be constructed 
using a labeled nucleic acid instead of the iRNA. The ability of the fusogenic agent to 
respond to, e.g., change charge depending on, the pH environment, once the modular 
complex is taken up by the cell, can be evaluated, e.g., by preparation of an endosome 
preparation, or by microscopy techniques, as described above. A two-step assay can also be 
performed, wherein a first assay evaluates the ability of a test compound alone to respond to, 
e.g., change charge depending on, the pH environment; and a second assay evaluates the 
ability of a modular complex that includes the test compound to respond to, e.g., change 
charge depending on, the pH environment. 

Condensing agent 

The condensing agent of a modular complex described herein can interact with (e.g., 
attracts, holds, or binds to) an iRNA agent and act to (a) condense, e.g., reduce the size or 
charge of the iRNA agent and/or (b) protect the iRNA agent, e.g., protect the iRNA agent 
against degradation. The condensing agent can include a moiety, e.g., a charged moiety, that 
can interact with a nucleic acid, e.g., an iRNA agent, e.g., by ionic interactions. The 
condensing agent would preferably be a charged polymer, e.g., a polycationic chain. The 
condensing agent can be a polylysine (PLL), spermine, spermidine, polyamine, 
pseudopeptide-polyamine, peptidomimetic polyamine, dendrimer polyamine, arginine, 
amidine, protamine, cationic lipid, cationic porphyrin, quaternary salt of a polyamine, or an 
alpha helical peptide. 

A condensing agent can have the following characteristics: (a) is at least 1 Da in size; (b) 
has at least 2 charged groups, preferably between 2 and 100 charged groups; (c) is present in 
the complex at a ratio of at least 1:1 condensing agent to carrier agent; (d) is present in the 
complex at a ratio of at least 1:1 condensing agent to fusogenic agent; (e) is present in the 
complex at a ratio of at least 1:1 condensing agent to targeting agent. 

Other suitable condensing agents can be tested and identified by a skilled artisan, e.g., 
by evaluating the ability of a test agent to interact with a nucleic acid, e.g., an iRNA agent. 
The ability of a test agent to interact with a nucleic acid, e.g., an iRNA agent, e.g., to 
condense or protect the iRNA agent, can be evaluated by routine techniques. In one assay, a 
test agent is contacted with a nucleic acid, and the size and/or charge of the contacted nucleic 
acid is evaluated by a technique suitable to detect changes in molecular mass and/or charge. 
Such techniques include non-denaturing gel electrophoresis, immunological methods, e.g., 
immunoprecipitation, gel filtration, ionic interaction chromatography, and the like. A test 
agent is identified as a condensing agent if it changes the mass and/or charge (preferably 
both) of the contacted nucleic acid, compared to a control. A two-step assay can also be 
performed, wherein a first assay evaluates the ability of a test compound alone to interact 
with, e.g., bind to, e.g., condense the charge and/or mass of, a nucleic acid; and a second assay 
evaluates the ability of a modular complex that includes the test compound to interact with, 
e.g., bind to, e.g., condense the charge and/or mass of, a nucleic acid. 

Amphipathic Delivery Agents 

In one aspect, the invention features an amphipathic delivery conjugate or module, 
such as those described herein and those described in copending, co-owned United States 
Provisional Application Serial No. 60/455,050 (Attorney Docket No. 14174-065P01), filed 
on March 13, 2003, which is hereby incorporated by reference. 

In addition, the invention includes an iRNA agent described herein, e.g., a palindromic 
iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent which targets a 
gene described herein, e.g., a gene active in the liver, an iRNA agent having a chemical 
modification described herein, e.g., a modification which enhances resistance to degradation, 
an iRNA agent having an architecture or structure described herein, an iRNA agent 
administered as described herein, or an iRNA agent formulated as described herein, 
combined with, associated with, and delivered by such an amphipathic delivery conjugate. 

An amphipathic molecule is a molecule having a hydrophobic and a hydrophilic 
region. Such molecules can interact with (e.g., penetrate or disrupt) lipids, e.g., a lipid 
bilayer of a cell. As such, they can serve as a delivery agent for an associated (e.g., bound) 
iRNA (e.g., an iRNA or sRNA described herein). A preferred amphipathic molecule to be 
used in the compositions described herein (e.g., the amphipathic iRNA constructs described 
herein) is a polymer. The polymer may have a secondary structure, e.g., a repeating 
secondary structure. 

One example of an amphipathic polymer is an amphipathic polypeptide, e.g., a 
polypeptide having a secondary structure such that the polypeptide has a hydrophilic and a 
hydrophobic face. The design of amphipathic peptide structures (e.g., alpha-helical 
polypeptides) is routine to one of skill in the art. For example, the following references 
provide guidance: Grell et al. (2001) Protein design and folding: template trapping of self- 
assembled helical bundles J Pept Sci 7(3):146-51; Chen et al. (2002) Determination of 
stereochemistry stability coefficients of amino acid side-chains in an amphipathic alpha-helix 
J Pept Res 59(1):18-33; Iwata et al. (1994) Design and synthesis of amphipathic 3(10)-helical 
peptides and their interactions with phospholipid bilayers and ion channel formation J Biol 
Chem 269(7):4928-33; Cornut et al. (1994) The amphipathic alpha-helix concept. 
Application to the de novo design of ideally amphipathic Leu, Lys peptides with hemolytic 
activity higher than that of melittin FEBS Lett 349(1):29-33; Negrete et al. (1998) 
Deciphering the structural code for proteins: helical propensities in domain classes and 
statistical multiresidue information in alpha-helices. Protein Sci 7(6):1368-79. 
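
One common way to gauge whether a candidate peptide would present opposing hydrophobic and hydrophilic faces as an alpha-helix is to compute a helical hydrophobic moment. The sketch below is illustrative only and is not a method described in this disclosure: the binary hydrophobicity scale, the 100 degree per residue helical angle, the function name, and the two Leu/Lys test sequences are all assumptions.

```python
import math

# Assumption: a deliberately crude binary scale (+1 hydrophobic, -1 otherwise).
HYDROPHOBIC_RESIDUES = set("AILMFWV")


def hydrophobic_moment(sequence: str, angle_deg: float = 100.0) -> float:
    """Mean hydrophobic moment of a peptide modeled as an ideal helix.

    Each residue contributes a vector of length +1 or -1 pointing at
    i * angle_deg around the helix axis; the moment is the length of the
    vector sum divided by the number of residues.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    sum_x = sum_y = 0.0
    for i, residue in enumerate(sequence.upper()):
        h = 1.0 if residue in HYDROPHOBIC_RESIDUES else -1.0
        theta = math.radians(i * angle_deg)
        sum_x += h * math.cos(theta)
        sum_y += h * math.sin(theta)
    return math.hypot(sum_x, sum_y) / len(sequence)


# Hypothetical sequences: Leu placed on one side of the helical wheel,
# versus a strictly alternating Leu/Lys pattern.
segregated = "LKKLLKKLLKLLKKLLKK"
alternating = "LK" * 9

print(f"segregated:  {hydrophobic_moment(segregated):.2f}")
print(f"alternating: {hydrophobic_moment(alternating):.2f}")
```

A larger moment suggests that hydrophobic residues cluster on one face of the helix; a value near zero suggests that they are spread evenly around it.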

Another example of an amphipathic polymer is a polymer made up of two or more 
amphipathic subunits, e.g., two or more subunits containing cyclic moieties (e.g., a cyclic 
moiety having one or more hydrophilic groups and one or more hydrophobic groups). For 
example, the subunit may contain a steroid, e.g., cholic acid; or an aromatic moiety. Such 
moieties preferably can exhibit atropisomerism, such that they can form opposing 
hydrophobic and hydrophilic faces when in a polymer structure. 

The ability of a putative amphipathic molecule to interact with a lipid membrane, e.g., 
a cell membrane, can be tested by routine methods, e.g., in a cell free or cellular assay. For 
example, a test compound is combined or contacted with a synthetic lipid bilayer, a cellular 
membrane fraction, or a cell, and the test compound is evaluated for its ability to interact 
with, penetrate or disrupt the lipid bilayer, cell membrane or cell. The test compound can be 
labeled in order to detect the interaction with the lipid bilayer, cell membrane or cell. In 
another type of assay, the test compound is linked to a reporter molecule or an iRNA agent 
(e.g., an iRNA or sRNA described herein) and the ability of the reporter molecule or iRNA 
agent to penetrate the lipid bilayer, cell membrane or cell is evaluated. A two-step assay can 
also be performed, wherein a first assay evaluates the ability of a test compound alone to 
interact with a lipid bilayer, cell membrane or cell; and a second assay evaluates the ability of 
a construct (e.g., a construct described herein) that includes the test compound and a reporter 
or iRNA agent to interact with a lipid bilayer, cell membrane or cell. 

An amphipathic polymer useful in the compositions described herein has at least 2, 
preferably at least 5, more preferably at least 10, 25, 50, 100, 200, 500, 1000, 2000, 50000 or 
more subunits (e.g., amino acids or cyclic subunits). A single amphipathic polymer can be 
linked to one or more, e.g., 2, 3, 5, 10 or more iRNA agents (e.g., iRNA or sRNA agents 
described herein). In some embodiments, an amphipathic polymer can contain both amino 
acid and cyclic subunits, e.g., aromatic subunits. 

The invention features a composition that includes an iRNA agent (e.g., an iRNA or 
sRNA described herein) in association with an amphipathic molecule. Such compositions 
may be referred to herein as "amphipathic iRNA constructs." Such compositions and 
constructs are useful in the delivery or targeting of iRNA agents, e.g., delivery or targeting 
of iRNA agents to a cell. While not wanting to be bound by theory, such compositions and 
constructs can increase the porosity of, e.g., can penetrate or disrupt, a lipid (e.g., a lipid 
bilayer of a cell), e.g., to allow entry of the iRNA agent into a cell. 

In one aspect, the invention relates to a composition comprising an iRNA agent (e.g., 
an iRNA or sRNA agent described herein) linked to an amphipathic molecule. The iRNA 
agent and the amphipathic molecule may be held in continuous contact with one another by 
either covalent or noncovalent linkages. 

The amphipathic molecule of the composition or construct is preferably other than a 
phospholipid, e.g., other than a micelle, membrane or membrane fragment. 

The amphipathic molecule of the composition or construct is preferably a polymer. 
The polymer may include two or more amphipathic subunits. One or more hydrophilic 
groups and one or more hydrophobic groups may be present on the polymer. The polymer 
may have a repeating secondary structure as well as a first face and a second face. The 
distribution of the hydrophilic groups and the hydrophobic groups along the repeating 
secondary structure can be such that one face of the polymer is a hydrophilic face and the 
other face of the polymer is a hydrophobic face. 

The amphipathic molecule can be a polypeptide, e.g., a polypeptide comprising an 
α-helical conformation as its secondary structure. 

In one embodiment, the amphipathic polymer includes one or more subunits 
containing one or more cyclic moieties (e.g., a cyclic moiety having one or more hydrophilic 
groups and/or one or more hydrophobic groups). In one embodiment, the polymer is a 
polymer of cyclic moieties such that the moieties have alternating hydrophobic and 
hydrophilic groups. For example, the subunit may contain a steroid, e.g., cholic acid. In 
another example, the subunit may contain an aromatic moiety. The aromatic moiety may be 
one that can exhibit atropisomerism, e.g., a 2,2'-bis(substituted)-1,1'-binaphthyl or a 2,2'- 
bis(substituted) biphenyl. A subunit may include an aromatic moiety of Formula (M): 





(M) 





Referring to Formula M, R1 is C1-C100 alkyl optionally substituted with aryl, alkenyl, 
alkynyl, alkoxy or halo and/or optionally inserted with O, S, alkenyl or alkynyl; C1-C100 
perfluoroalkyl; or OR5. 

R2 is hydroxy; nitro; sulfate; phosphate; phosphate ester; sulfonic acid; OR6; or C1-C100 
alkyl optionally substituted with hydroxy, halo, nitro, aryl or alkyl sulfinyl, aryl or alkyl 
sulfonyl, sulfate, sulfonic acid, phosphate, phosphate ester, substituted or unsubstituted aryl, 
carboxyl, carboxylate, amino carbonyl, or alkoxycarbonyl, and/or optionally inserted with O, 
NH, S, S(O), SO2, alkenyl, or alkynyl. 

R3 is hydrogen, or when taken together with R4 forms a fused phenyl ring. 

R4 is hydrogen, or when taken together with R3 forms a fused phenyl ring. 

R5 is C1-C100 alkyl optionally substituted with aryl, alkenyl, alkynyl, alkoxy or halo 
and/or optionally inserted with O, S, alkenyl or alkynyl; or C1-C100 perfluoroalkyl; and R6 is 
C1-C100 alkyl optionally substituted with hydroxy, halo, nitro, aryl or alkyl sulfinyl, aryl or 
alkyl sulfonyl, sulfate, sulfonic acid, phosphate, phosphate ester, substituted or unsubstituted 
aryl, carboxyl, carboxylate, amino carbonyl, or alkoxycarbonyl, and/or optionally inserted 
with O, NH, S, S(O), SO2, alkenyl, or alkynyl. 

Increasing cellular uptake of dsRNAs 

A method of the invention can include the administration of an iRNA agent and a 
drug that affects the uptake of the iRNA agent into the cell. The drug can be administered 
before, after, or at the same time that the iRNA agent is administered. The drug can be 
covalently linked to the iRNA agent. The drug can be, for example, a lipopolysaccharide, an 
activator of p38 MAP kinase, or an activator of NF-κB. The drug can have a transient effect 
on the cell. 

The drug can increase the uptake of the iRNA agent into the cell, for example, by 
disrupting the cell's cytoskeleton, e.g., by disrupting the cell's microtubules, microfilaments, 
and/or intermediate filaments. The drug can be, for example, taxol, vincristine, vinblastine, 
cytochalasin, nocodazole, jasplakinolide, latrunculin A, phalloidin, swinholide A, indanocine, 
or myoseverin. 

The drug can also increase the uptake of the iRNA agent into the cell by activating an 
inflammatory response, for example. Exemplary drugs that would have such an effect 
include tumor necrosis factor alpha (TNFalpha), interleukin-1 beta, or gamma interferon. 

iRNA conjugates 

An iRNA agent can be coupled, e.g., covalently coupled, to a second agent. For 
example, an iRNA agent used to treat a particular disorder can be coupled to a second 
therapeutic agent, e.g., an agent other than the iRNA agent. The second therapeutic agent 
can be one which is directed to the treatment of the same disorder. For example, in the case 
of an iRNA used to treat a disorder characterized by unwanted cell proliferation, e.g., cancer, 
the iRNA agent can be coupled to a second agent which has an anti-cancer effect. For 
example, it can be coupled to an agent which stimulates the immune system, e.g., a CpG 
motif, or more generally an agent that activates a toll-like receptor and/or increases the 
production of gamma interferon. 

iRNA Production 

An iRNA can be produced, e.g., in bulk, by a variety of methods. Exemplary 
methods include: organic synthesis and RNA cleavage, e.g., in vitro cleavage. 

Organic Synthesis 

An iRNA can be made by separately synthesizing each respective strand of a double- 
stranded RNA molecule. The component strands can then be annealed. 

A large bioreactor, e.g., the OligoPilot II from Pharmacia Biotec AB (Uppsala, 
Sweden), can be used to produce a large amount of a particular RNA strand for a given 
iRNA. The OligoPilot II reactor can efficiently couple a nucleotide using only a 1.5 molar 
excess of a phosphoramidite nucleotide. To make an RNA strand, ribonucleotide amidites 
are used. Standard cycles of monomer addition can be used to synthesize the 21 to 23 
nucleotide strand for the iRNA. Typically, the two complementary strands are produced 
separately and then annealed, e.g., after release from the solid support and deprotection. 
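
Because each strand is built by repeated cycles of monomer addition, the fraction of full-length product falls with every coupling step. The following minimal sketch estimates that fraction for the 21 to 23 nucleotide strands mentioned above. The per-cycle coupling efficiency of 0.99 is a hypothetical value assumed for illustration and is not a figure from this disclosure; the sketch also assumes the first nucleotide is already attached to the solid support, so a strand of n nucleotides needs n - 1 couplings.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float) -> float:
    """Expected fraction of full-length strand after stepwise synthesis.

    Assumes the first nucleotide is pre-attached to the support, so
    (length_nt - 1) coupling cycles are needed, each succeeding
    independently with probability coupling_efficiency.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (length_nt - 1)


ASSUMED_EFFICIENCY = 0.99  # hypothetical per-cycle value, for illustration

for n in (21, 22, 23):
    print(f"{n} nt: {full_length_yield(n, ASSUMED_EFFICIENCY):.3f}")
```

Since a duplex needs both strands, the same estimate would be applied to each complementary strand before annealing.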

Organic synthesis can be used to produce a discrete iRNA species. The 
complementarity of the species to a particular target gene can be precisely specified. For 
example, the species may be complementary to a region that includes a polymorphism, e.g., a 
single nucleotide polymorphism. Further, the location of the polymorphism can be precisely 
defined. In some embodiments, the polymorphism is located in an internal region, e.g., at 
least 4, 5, 7, or 9 nucleotides from one or both of the termini. 
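
The positional constraint on the polymorphism can be expressed as a simple check. This is a sketch under stated assumptions: positions are 0-based indices along the strand, and the distance from a terminus is counted as the number of nucleotides lying between the polymorphic position and that end. The function names are hypothetical.

```python
from typing import Tuple


def distances_from_termini(position: int, strand_length: int) -> Tuple[int, int]:
    """Return (nucleotides before the position, nucleotides after it)."""
    if not 0 <= position < strand_length:
        raise ValueError("position must lie within the strand")
    return position, strand_length - 1 - position


def is_internal(position: int, strand_length: int, min_distance: int,
                require_both: bool = True) -> bool:
    """True if the position is at least min_distance nucleotides from the
    termini: from both termini when require_both is True, otherwise from
    at least one of them."""
    d_5prime, d_3prime = distances_from_termini(position, strand_length)
    if require_both:
        return d_5prime >= min_distance and d_3prime >= min_distance
    return d_5prime >= min_distance or d_3prime >= min_distance


# Example: a polymorphism at the center of a 21 nt strand.
print(is_internal(position=10, strand_length=21, min_distance=9))
```

The `require_both` flag mirrors the "one or both of the termini" wording above.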

dsRNA Cleavage 

iRNAs can also be made by cleaving a larger ds iRNA. The cleavage can be 
mediated in vitro or in vivo. For example, to produce iRNAs by cleavage in vitro, the 
following method can be used: 

In vitro transcription. dsRNA is produced by transcribing a nucleic acid (DNA) 
segment in both directions. For example, the HiScribe™ RNAi transcription kit (New 
England Biolabs) provides a vector and a method for producing a dsRNA for a nucleic acid 
segment that is cloned into the vector at a position flanked on either side by a T7 promoter. 
Separate templates are generated for T7 transcription of the two complementary strands for 
the dsRNA. The templates are transcribed in vitro by addition of T7 RNA polymerase and 
dsRNA is produced. Similar methods using PCR and/or other RNA polymerases (e.g., T3 or 
SP6 polymerase) can also be used. In one embodiment, RNA generated by this method is 
carefully purified to remove endotoxins that may contaminate preparations of the 
recombinant enzymes. 

In vitro cleavage. dsRNA is cleaved in vitro into iRNAs, for example, using a Dicer 
or comparable RNAse III-based activity. For example, the dsRNA can be incubated in an in 
vitro extract from Drosophila or using purified components, e.g., a purified RNAse or RISC 
complex (RNA-induced silencing complex). See, e.g., Ketting et al. Genes Dev 2001 Oct 
15;15(20):2654-9, and Hammond 2001 Aug 10;293(5532):1146-50. 

dsRNA cleavage generally produces a plurality of iRNA species, each being a 
particular 21 to 23 nt fragment of a source dsRNA molecule. For example, iRNAs that 
include sequences complementary to overlapping regions and adjacent regions of a source 
dsRNA molecule may be present. 
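
To make concrete why cleavage yields many overlapping and adjacent species, the sketch below enumerates every 21 to 23 nt window of one strand of a source sequence. It is an idealized illustration, not a model of Dicer activity: actual cleavage does not sample all windows uniformly, and the 30 nt source sequence and function name are hypothetical.

```python
from typing import Iterator, Sequence, Tuple


def enumerate_fragments(source: str,
                        sizes: Sequence[int] = (21, 22, 23)
                        ) -> Iterator[Tuple[int, str]]:
    """Yield (start index, fragment) for every window of each given size."""
    for size in sizes:
        for start in range(len(source) - size + 1):
            yield start, source[start:start + size]


# Hypothetical 30 nt strand of a source dsRNA.
source_strand = ("ACGU" * 8)[:30]

fragments = list(enumerate_fragments(source_strand))
print(len(fragments))        # 10 + 9 + 8 = 27 possible windows
print(fragments[0])          # first 21 nt window, starting at index 0
```

Windows whose start positions differ by less than the fragment length overlap one another, while windows that begin where another ends correspond to adjacent regions of the source molecule.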

Regardless of the method of synthesis, the iRNA preparation can be prepared in a 
solution (e.g., an aqueous and/or organic solution) that is appropriate for formulation. For 
example, the iRNA preparation can be precipitated and redissolved in pure double-distilled 
water, and lyophilized. The dried iRNA can then be resuspended in a solution appropriate for 
the intended formulation process. 

Synthesis of modified and nucleotide surrogate iRNA agents is discussed below. 

FORMULATION 

The iRNA agents described herein can be formulated for administration to a subject. 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. 

A formulated iRNA composition can assume a variety of states. In some examples, 
the composition is at least partially crystalline, uniformly crystalline, and/or anhydrous (e.g., 
less than 80, 50, 30, 20, or 10% water). In another example, the iRNA is in an aqueous 
phase, e.g., in a solution that includes water. 

The aqueous phase or the crystalline compositions can, e.g., be incorporated into a 
delivery vehicle, e.g., a liposome (particularly for the aqueous phase) or a particle (e.g., a 
microparticle as can be appropriate for a crystalline composition). Generally, the iRNA 
composition is formulated in a manner that is compatible with the intended method of 
administration (see below). 

In particular embodiments, the composition is prepared by at least one of the 
following methods: spray drying, lyophilization, vacuum drying, evaporation, fluid bed 
drying, or a combination of these techniques; or sonication with a lipid, freeze-drying, 
condensation and other self-assembly. 

An iRNA preparation can be formulated in combination with another agent, e.g., 
another therapeutic agent or an agent that stabilizes an iRNA, e.g., a protein that complexes 
with iRNA to form an iRNP. Still other agents include chelators, e.g., EDTA (e.g., to 
remove divalent cations such as Mg2+), salts, RNAse inhibitors (e.g., a broad specificity 
RNAse inhibitor such as RNAsin) and so forth. 

In one embodiment, the iRNA preparation includes another iRNA agent, e.g., a 
second iRNA that can mediate RNAi with respect to a second gene, or with respect to the 
same gene. Still other preparations can include at least 3, 5, ten, twenty, fifty, or a hundred or 
more different iRNA species. Such iRNAs can mediate RNAi with respect to a similar 
number of different genes. 

In one embodiment, the iRNA preparation includes at least a second therapeutic agent 
(e.g., an agent other than an RNA or a DNA). For example, an iRNA composition for the 
treatment of a viral disease, e.g., HIV, might include a known antiviral agent (e.g., a protease 
inhibitor or reverse transcriptase inhibitor). In another example, an iRNA composition for the 
treatment of a cancer might further comprise a chemotherapeutic agent. 

Exemplary formulations are discussed below: 

Liposomes 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. An iRNA agent, 
e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA 
agent which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, 
e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) preparation can be 
formulated for delivery in a membranous molecular assembly, e.g., a liposome or a micelle. 
As used herein, the term "liposome" refers to a vesicle composed of amphiphilic lipids 
arranged in at least one bilayer, e.g., one bilayer or a plurality of bilayers. Liposomes include 
unilamellar and multilamellar vesicles that have a membrane formed from a lipophilic 
material and an aqueous interior. The aqueous portion contains the iRNA composition. The 
lipophilic material isolates the aqueous interior from an aqueous exterior, which typically 
does not include the iRNA composition, although in some examples, it may. Liposomes are 
useful for the transfer and delivery of active ingredients to the site of action. Because the 
liposomal membrane is structurally similar to biological membranes, when liposomes are 
applied to a tissue, the liposomal bilayer fuses with the bilayer of the cellular membranes. As the 
merging of the liposome and cell progresses, the internal aqueous contents that include the 
iRNA are delivered into the cell where the iRNA can specifically bind to a target RNA and 
can mediate RNAi. In some cases the liposomes are also specifically targeted, e.g., to direct 
the iRNA to particular cell types. 

A liposome containing an iRNA can be prepared by a variety of methods. 

In one example, the lipid component of a liposome is dissolved in a detergent so that 
micelles are formed with the lipid component. For example, the lipid component can be an 
amphipathic cationic lipid or lipid conjugate. The detergent can have a high critical micelle 
concentration and may be nonionic. Exemplary detergents include cholate, CHAPS, 
octylglucoside, deoxycholate, and lauroyl sarcosine. The iRNA preparation is then added to 
the micelles that include the lipid component. The cationic groups on the lipid interact with 
the iRNA and condense around the iRNA to form a liposome. After condensation, the 
detergent is removed, e.g., by dialysis, to yield a liposomal preparation of iRNA. 

If necessary, a carrier compound that assists in condensation can be added during the
condensation reaction, e.g., by controlled addition. For example, the carrier compound can
be a polymer other than a nucleic acid (e.g., spermine or spermidine). The pH can also be
adjusted to favor condensation.

Methods for producing stable polynucleotide delivery vehicles, which incorporate a
polynucleotide/cationic lipid complex as a structural component of the delivery vehicle, are
further described in, e.g., WO 96/37194. Liposome formation can also include
one or more aspects of exemplary methods described in Felgner, P. L. et al., Proc. Natl.
Acad. Sci., USA 8:7413-7417, 1987; U.S. Pat. No. 4,897,355; U.S. Pat. No. 5,171,678;
Bangham, et al. J. Mol. Biol. 23:238, 1965; Olson, et al. Biochim. Biophys. Acta 557:9,
1979; Szoka, et al. Proc. Natl. Acad. Sci. 75:4194, 1978; Mayhew, et al. Biochim. Biophys.
Acta 775:169, 1984; Kim, et al. Biochim. Biophys. Acta 728:339, 1983; and Fukunaga, et al.
Endocrinol. 115:757, 1984. Commonly used techniques for preparing lipid aggregates of
appropriate size for use as delivery vehicles include sonication and freeze-thaw plus
extrusion (see, e.g., Mayer, et al. Biochim. Biophys. Acta 858:161, 1986). Microfluidization
can be used when consistently small (50 to 200 nm) and relatively uniform aggregates are
desired (Mayhew, et al. Biochim. Biophys. Acta 775:169, 1984). These methods are readily
adapted to packaging iRNA preparations into liposomes.

Liposomes that are pH-sensitive or negatively charged entrap nucleic acid molecules
rather than complex with them. Since both the nucleic acid molecules and the lipid are
similarly charged, repulsion rather than complex formation occurs. Nevertheless, some
nucleic acid molecules are entrapped within the aqueous interior of these liposomes.
pH-sensitive liposomes have been used to deliver DNA encoding the thymidine kinase gene to
cell monolayers in culture. Expression of the exogenous gene was detected in the target cells
(Zhou et al., Journal of Controlled Release, 19, (1992) 269-274).

One major type of liposomal composition includes phospholipids other than
naturally-derived phosphatidylcholine. Neutral liposome compositions, for example, can be
formed from dimyristoyl phosphatidylcholine (DMPC) or dipalmitoyl phosphatidylcholine
(DPPC). Anionic liposome compositions generally are formed from dimyristoyl
phosphatidylglycerol, while anionic fusogenic liposomes are formed primarily from dioleoyl
phosphatidylethanolamine (DOPE). Another type of liposomal composition is formed from
phosphatidylcholine (PC) such as, for example, soybean PC, and egg PC. Another type is
formed from mixtures of phospholipid and/or phosphatidylcholine and/or cholesterol.

Other methods of introducing liposomes into cells in vitro and in vivo are described in
U.S. Pat. No. 5,283,185; U.S. Pat. No. 5,171,678; WO 94/00569; WO 93/24640; WO
91/16024; Felgner, J. Biol. Chem. 269:2550, 1994; Nabel, Proc. Natl. Acad. Sci. 90:11307,
1993; Nabel, Human Gene Ther. 3:649, 1992; Gershon, Biochem. 32:7143, 1993; and Strauss,
EMBO J. 11:417, 1992.

In one embodiment, cationic liposomes are used. Cationic liposomes possess the
advantage of being able to fuse to the cell membrane. Non-cationic liposomes, although not
able to fuse as efficiently with the plasma membrane, are taken up by macrophages in vivo
and can be used to deliver iRNAs to macrophages.

Further advantages of liposomes include: liposomes obtained from natural
phospholipids are biocompatible and biodegradable; liposomes can incorporate a wide range
of water and lipid soluble drugs; liposomes can protect encapsulated iRNAs in their internal
compartments from metabolism and degradation (Rosoff, in "Pharmaceutical Dosage
Forms," Lieberman, Rieger and Banker (Eds.), 1988, volume 1, p. 245). Important
considerations in the preparation of liposome formulations are the lipid surface charge,
vesicle size and the aqueous volume of the liposomes.

A positively charged synthetic cationic lipid, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-
trimethylammonium chloride (DOTMA) can be used to form small liposomes that interact
spontaneously with nucleic acid to form lipid-nucleic acid complexes which are capable of
fusing with the negatively charged lipids of the cell membranes of tissue culture cells,
resulting in delivery of iRNA (see, e.g., Felgner, P. L. et al., Proc. Natl. Acad. Sci., USA
8:7413-7417, 1987 and U.S. Pat. No. 4,897,355 for a description of DOTMA and its use with
DNA).

A DOTMA analogue, 1,2-bis(oleoyloxy)-3-(trimethylammonia)propane (DOTAP)
can be used in combination with a phospholipid to form DNA-complexing vesicles.
Lipofectin™ (Bethesda Research Laboratories, Gaithersburg, Md.) is an effective agent for
the delivery of highly anionic nucleic acids into living tissue culture cells; it comprises
positively charged DOTMA liposomes which interact spontaneously with negatively charged
polynucleotides to form complexes. When enough positively charged liposomes are used,
the net charge on the resulting complexes is also positive. Positively charged complexes
prepared in this way spontaneously attach to negatively charged cell surfaces, fuse with the
plasma membrane, and efficiently deliver functional nucleic acids into, for example, tissue
culture cells. Another commercially available cationic lipid, 1,2-bis(oleoyloxy)-3,3-
(trimethylammonia)propane ("DOTAP") (Boehringer Mannheim, Indianapolis, Indiana)
differs from DOTMA in that the oleoyl moieties are linked by ester, rather than ether
linkages.

Other reported cationic lipid compounds include those that have been conjugated to a
variety of moieties including, for example, carboxyspermine which has been conjugated to
one of two types of lipids and includes compounds such as 5-carboxyspermylglycine
dioctaoleoylamide ("DOGS") (Transfectam™, Promega, Madison, Wisconsin) and
dipalmitoylphosphatidylethanolamine 5-carboxyspermyl-amide ("DPPES") (see, e.g., U.S.
Pat. No. 5,171,678).

Another cationic lipid conjugate includes derivatization of the lipid with cholesterol
("DC-Chol") which has been formulated into liposomes in combination with DOPE (See,
Gao, X. and Huang, L., Biochim. Biophys. Res. Commun. 179:280, 1991). Lipopolylysine,
made by conjugating polylysine to DOPE, has been reported to be effective for transfection
in the presence of serum (Zhou, X. et al., Biochim. Biophys. Acta 1065:8, 1991). For certain
cell lines, these liposomes containing conjugated cationic lipids are said to exhibit lower
toxicity and provide more efficient transfection than the DOTMA-containing compositions.
Other commercially available cationic lipid products include DMRIE and DMRIE-HP
(Vical, La Jolla, California) and Lipofectamine (DOSPA) (Life Technology, Inc.,
Gaithersburg, Maryland). Other cationic lipids suitable for the delivery of oligonucleotides
are described in WO 98/39359 and WO 96/37194.

Liposomal formulations are particularly suited for topical administration; liposomes
present several advantages over other formulations. Such advantages include reduced side
effects related to high systemic absorption of the administered drug, increased accumulation
of the administered drug at the desired target, and the ability to administer iRNA into the
skin. In some implementations, liposomes are used for delivering iRNA to epidermal cells
and also to enhance the penetration of iRNA into dermal tissues, e.g., into skin. For example,
the liposomes can be applied topically. Topical delivery of drugs formulated as liposomes to
the skin has been documented (see, e.g., Weiner et al. Journal of Drug Targeting, 1992, vol.
2, 405-410 and du Plessis et al. Antiviral Research, 18, 1992, 259-265; Mannino, R. J. and
Gould-Fogerite, S., Biotechniques 6:682-690, 1988; Itani, T. et al. Gene 56:267-276, 1987;
Nicolau, C. et al. Meth. Enz. 149:157-176, 1987; Straubinger, R. M. and Papahadjopoulos,
D. Meth. Enz. 101:512-527, 1983; Wang, C. Y. and Huang, L., Proc. Natl. Acad. Sci. USA
84:7851-7855, 1987).

Non-ionic liposomal systems have also been examined to determine their utility in the
delivery of drugs to the skin, in particular systems comprising non-ionic surfactant and
cholesterol. Non-ionic liposomal formulations comprising Novasome I (glyceryl
dilaurate/cholesterol/polyoxyethylene-10-stearyl ether) and Novasome II (glyceryl distearate/
cholesterol/polyoxyethylene-10-stearyl ether) were used to deliver a drug into the dermis of
mouse skin. Such formulations with iRNA are useful for treating a dermatological disorder.

Liposomes that include iRNA can be made highly deformable. Such deformability
can enable the liposomes to penetrate through pores that are smaller than the average radius of
the liposome. For example, transfersomes are a type of deformable liposome.
Transfersomes can be made by adding surface edge activators, usually surfactants, to a
standard liposomal composition. Transfersomes that include iRNA can be delivered, for
example, subcutaneously by injection in order to deliver iRNA to keratinocytes in the skin.
In order to cross intact mammalian skin, lipid vesicles must pass through a series of fine
pores, each with a diameter less than 50 nm, under the influence of a suitable transdermal
gradient. In addition, due to the lipid properties, these transfersomes can be self-optimizing
(adaptive to the shape of pores, e.g., in the skin) and self-repairing, can frequently reach
their targets without fragmenting, and are often self-loading. The iRNA agents can include an
RRMS tethered to a moiety which improves association with a liposome.

Surfactants 

For ease of exposition the formulations, compositions and methods in this section are
discussed largely with regard to unmodified iRNA agents. It should be understood, however,
that these formulations, compositions and methods can be practiced with other iRNA agents,
e.g., modified iRNA agents, and such practice is within the invention. Surfactants find wide
application in formulations such as emulsions (including microemulsions) and liposomes (see
above). iRNA (or a precursor, e.g., a larger dsRNA which can be processed into an iRNA, or
a DNA which encodes an iRNA or precursor) compositions can include a surfactant. In one
embodiment, the iRNA is formulated as an emulsion that includes a surfactant. The most
common way of classifying and ranking the properties of the many different types of
surfactants, both natural and synthetic, is by the use of the hydrophile/lipophile balance
(HLB). The nature of the hydrophilic group provides the most useful means for categorizing
the different surfactants used in formulations (Rieger, in "Pharmaceutical Dosage Forms,"
Marcel Dekker, Inc., New York, NY, 1988, p. 285).

If the surfactant molecule is not ionized, it is classified as a nonionic surfactant.
Nonionic surfactants find wide application in pharmaceutical products and are usable over a
wide range of pH values. In general their HLB values range from 2 to about 18 depending
on their structure. Nonionic surfactants include nonionic esters such as ethylene glycol
esters, propylene glycol esters, glyceryl esters, polyglyceryl esters, sorbitan esters, sucrose
esters, and ethoxylated esters. Nonionic alkanolamides and ethers such as fatty alcohol
ethoxylates, propoxylated alcohols, and ethoxylated/propoxylated block polymers are also
included in this class. The polyoxyethylene surfactants are the most popular members of the
nonionic surfactant class.

If the surfactant molecule carries a negative charge when it is dissolved or dispersed
in water, the surfactant is classified as anionic. Anionic surfactants include carboxylates
such as soaps, acyl lactylates, acyl amides of amino acids, esters of sulfuric acid such as alkyl
sulfates and ethoxylated alkyl sulfates, sulfonates such as alkyl benzene sulfonates, acyl
isethionates, acyl taurates and sulfosuccinates, and phosphates. The most important members
of the anionic surfactant class are the alkyl sulfates and the soaps.

If the surfactant molecule carries a positive charge when it is dissolved or dispersed in
water, the surfactant is classified as cationic. Cationic surfactants include quaternary
ammonium salts and ethoxylated amines. The quaternary ammonium salts are the most used
members of this class.

If the surfactant molecule has the ability to carry either a positive or negative charge,
the surfactant is classified as amphoteric. Amphoteric surfactants include acrylic acid
derivatives, substituted alkylamides, N-alkylbetaines and phosphatides.

The use of surfactants in drug products, formulations and in emulsions has been
reviewed (Rieger, in "Pharmaceutical Dosage Forms," Marcel Dekker, Inc., New York, NY,
1988, p. 285).
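
The charge-based classification described in this section can be summarized as a simple decision rule. The following Python sketch is illustrative only: the names `SurfactantClass` and `classify_surfactant`, and the two boolean inputs, are assumptions introduced here and are not part of the disclosure.

```python
from enum import Enum


class SurfactantClass(Enum):
    NONIONIC = "nonionic"
    ANIONIC = "anionic"
    CATIONIC = "cationic"
    AMPHOTERIC = "amphoteric"


def classify_surfactant(can_carry_positive: bool,
                        can_carry_negative: bool) -> SurfactantClass:
    """Classify a surfactant by the charge it can carry in water.

    Both arguments are hypothetical inputs describing the molecule when it
    is dissolved or dispersed in water.
    """
    if can_carry_positive and can_carry_negative:
        # Able to carry either charge: amphoteric (e.g., N-alkylbetaines).
        return SurfactantClass.AMPHOTERIC
    if can_carry_positive:
        # Positive charge only: cationic (e.g., quaternary ammonium salts).
        return SurfactantClass.CATIONIC
    if can_carry_negative:
        # Negative charge only: anionic (e.g., alkyl sulfates, soaps).
        return SurfactantClass.ANIONIC
    # Not ionized: nonionic (e.g., sorbitan esters).
    return SurfactantClass.NONIONIC
```

For example, `classify_surfactant(True, False)` returns `SurfactantClass.CATIONIC`, which corresponds to a quaternary ammonium salt.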

Micelles and other Membranous Formulations 

For ease of exposition the micelles and other formulations, compositions and methods
in this section are discussed largely with regard to unmodified iRNA agents. It should be
understood, however, that these micelles and other formulations, compositions and methods
can be practiced with other iRNA agents, e.g., modified iRNA agents, and such practice is
within the invention. The iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) composition can be provided as a micellar formulation. "Micelles" are
defined herein as a particular type of molecular assembly in which amphipathic molecules
are arranged in a spherical structure such that all the hydrophobic portions of the molecules
are directed inward, leaving the hydrophilic portions in contact with the surrounding aqueous
phase. The converse arrangement exists if the environment is hydrophobic.

A mixed micellar formulation suitable for delivery through transdermal membranes
may be prepared by mixing an aqueous solution of the iRNA composition, an alkali metal C8
to C22 alkyl sulphate, and micelle forming compounds. Exemplary micelle forming
compounds include lecithin, hyaluronic acid, pharmaceutically acceptable salts of hyaluronic
acid, glycolic acid, lactic acid, chamomile extract, cucumber extract, oleic acid, linoleic acid,
linolenic acid, monoolein, monooleates, monolaurates, borage oil, evening primrose oil,
menthol, trihydroxy oxo cholanyl glycine and pharmaceutically acceptable salts thereof,
glycerin, polyglycerin, lysine, polylysine, triolein, polyoxyethylene ethers and analogues
thereof, polidocanol alkyl ethers and analogues thereof, chenodeoxycholate, deoxycholate,
and mixtures thereof. The micelle forming compounds may be added at the same time or
after addition of the alkali metal alkyl sulphate. Mixed micelles will form with substantially
any kind of mixing of the ingredients but vigorous mixing is preferred in order to provide
smaller size micelles.

In one method a first micellar composition is prepared which contains the iRNA
composition and at least the alkali metal alkyl sulphate. The first micellar composition is then
mixed with at least three micelle forming compounds to form a mixed micellar composition.
In another method, the micellar composition is prepared by mixing the iRNA composition,
the alkali metal alkyl sulphate and at least one of the micelle forming compounds, followed
by addition of the remaining micelle forming compounds, with vigorous mixing.

Phenol and/or m-cresol may be added to the mixed micellar composition to stabilize
the formulation and protect against bacterial growth. Alternatively, phenol and/or m-cresol
may be added with the micelle forming ingredients. An isotonic agent such as glycerin may
also be added after formation of the mixed micellar composition.

For delivery of the micellar formulation as a spray, the formulation can be put into an 
aerosol dispenser and the dispenser is charged with a propellant. The propellant, which is 
under pressure, is in liquid form in the dispenser. The ratios of the ingredients are adjusted 
so that the aqueous and propellant phases become one, i.e., there is one phase. If there are
two phases, it is necessary to shake the dispenser prior to dispensing a portion of the
contents, e.g., through a metered valve. The dispensed dose of pharmaceutical agent is
propelled from the metered valve in a fine spray. 

The preferred propellants are hydrogen-containing chlorofluorocarbons, hydrogen-
containing fluorocarbons, dimethyl ether and diethyl ether. Even more preferred is HFA 134a
(1,1,1,2-tetrafluoroethane).

The specific concentrations of the essential ingredients can be determined by
relatively straightforward experimentation. For absorption through the oral cavities, it is
often desirable to increase, e.g., at least double or triple, the dosage used for injection or
administration through the gastrointestinal tract.

The iRNA agents can include an RRMS tethered to a moiety which improves 
association with a micelle or other membranous formulation. 

Particles 

For ease of exposition the particles, formulations, compositions and methods in this 
section are discussed largely with regard to unmodified iRNA agents. It should be 
understood, however, that these particles, formulations, compositions and methods can be 
practiced with other iRNA agents, e.g., modified iRNA agents, and such practice is within
the invention. In another embodiment, an iRNA agent, e.g., a double-stranded iRNA agent,
or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a
sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent,
or sRNA agent, or precursor thereof) preparation may be incorporated into a particle, e.g., a
microparticle. Microparticles can be produced by spray-drying, but may also be produced by
other methods including lyophilization, evaporation, fluid bed drying, vacuum drying, or a
combination of these techniques. See below for further description.

Sustained-Release Formulations. An iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed 
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, or precursor thereof) described herein can be formulated for 
controlled, e.g., slow release. Controlled release can be achieved by disposing the iRNA 
within a structure or substance which impedes its release. For example, iRNA can be disposed
within a porous matrix or in an erodable matrix, either of which allows release of the iRNA over a
period of time. 

Polymeric particles, e.g., polymeric microparticles, can be used as a sustained-
release reservoir of iRNA that is taken up by cells only when released from the microparticle
through biodegradation. The polymeric particles in this embodiment should therefore be
large enough to preclude phagocytosis (e.g., larger than 10 μm and preferably larger than 20
μm). Such particles can be produced by the same methods used to make smaller particles, but with
less vigorous mixing of the first and second emulsions. That is to say, a lower
homogenization speed, vortex mixing speed, or sonication setting can be used to obtain
particles having a diameter around 100 μm rather than 10 μm. The time of mixing also can be
altered.
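
The size criterion stated above can be expressed as a simple check. The following Python sketch is illustrative only: the function name `reservoir_size_category`, the category labels, and the treatment of the stated "e.g." values as hard thresholds are all assumptions made here.

```python
# Thresholds taken from the text: larger than 10 um, preferably larger than 20 um.
MIN_DIAMETER_UM = 10.0
PREFERRED_MIN_DIAMETER_UM = 20.0


def reservoir_size_category(diameter_um: float) -> str:
    """Categorize a particle diameter (in micrometers) against the
    phagocytosis-precluding sizes stated in the text.

    The labels returned are hypothetical names chosen for illustration.
    """
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    if diameter_um > PREFERRED_MIN_DIAMETER_UM:
        return "preferred"
    if diameter_um > MIN_DIAMETER_UM:
        return "acceptable"
    return "too small"
```

Under these assumptions, a particle of about 100 μm, as obtained with less vigorous mixing, falls in the "preferred" category, while a 10 μm particle does not meet the criterion.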

Larger microparticles can be formulated as a suspension, a powder, or an implantable 
solid, to be delivered by intramuscular, subcutaneous, intradermal, intravenous, or 
intraperitoneal injection; via inhalation (intranasal or intrapulmonary); orally; or by 
implantation. These particles are useful for delivery of any iRNA when slow release over a 
relatively long term is desired. The rate of degradation, and consequently of release, varies 
with the polymeric formulation. 

Microparticles preferably include pores, voids, hollows, defects or other interstitial 
spaces that allow the fluid suspension medium to freely permeate or perfuse the particulate 
boundary. For example, the perforated microstructures can be used to form hollow, porous 
spray dried microspheres. 

Polymeric particles containing iRNA (e.g., a sRNA) can be made using a double 
emulsion technique, for instance. First, the polymer is dissolved in an organic solvent. A 
preferred polymer is polylactic-co-glycolic acid (PLGA), with a lactic/glycolic acid weight 
ratio of 65:35, 50:50, or 75:25. Next, a sample of nucleic acid suspended in aqueous solution
is added to the polymer solution and the two solutions are mixed to form a first emulsion.
The solutions can be mixed by vortexing or shaking, and in a preferred method, the mixture
can be sonicated. Most preferable is any method by which the nucleic acid receives the least
amount of damage in the form of nicking, shearing, or degradation, while still allowing the
formation of an appropriate emulsion. For example, acceptable results can be obtained with a
Vibra-cell model VC-250 sonicator with a 1/8" microtip probe, at setting #3.

Spray-Drying. An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) can be prepared by spray drying. Spray dried iRNA can be administered
to a subject or be subjected to further formulation. A pharmaceutical composition of iRNA
can be prepared by spray drying a homogeneous aqueous mixture that includes an iRNA under
conditions sufficient to provide a dispersible powdered composition, e.g., a pharmaceutical 
composition. The material for spray drying can also include one or more of: a 
pharmaceutically acceptable excipient, or a dispersibility-enhancing amount of a 
physiologically acceptable, water-soluble protein. The spray-dried product can be a 
dispersible powder that includes the iRNA. 

Spray drying is a process that converts a liquid or slurry material to a dried particulate 
form. Spray drying can be used to provide powdered material for various administrative 
routes including inhalation. See, for example, M. Sacchetti and M. M. Van Oort in: 
Inhalation Aerosols: Physical and Biological Basis for Therapy, A. J. Hickey, ed. Marcel
Dekker, New York, 1996.

Spray drying can include atomizing a solution, emulsion, or suspension to form a fine 
mist of droplets and drying the droplets. The mist can be projected into a drying chamber 
(e.g., a vessel, tank, tubing, or coil) where it contacts a drying gas. The mist can include 
solid or liquid pore forming agents. The solvent and pore forming agents evaporate from the 
droplets into the drying gas to solidify the droplets, simultaneously forming pores throughout 
the solid. The solid (typically in a powder, particulate form) then is separated from the drying 
gas and collected. 

Spray drying includes bringing together a highly dispersed liquid and a sufficient
volume of air (e.g., hot air) to produce evaporation and drying of the liquid droplets. The
preparation to be spray dried can be any solution, coarse suspension, slurry, colloidal
dispersion, or paste that may be atomized using the selected spray drying apparatus.
Typically, the feed is sprayed into a current of warm filtered air that evaporates the solvent
and conveys the dried product to a collector. The spent air is then exhausted with the solvent.
Several different types of apparatus may be used to provide the desired product. For example,
commercial spray dryers manufactured by Buchi Ltd. or Niro Corp. can effectively produce
particles of desired size.

Spray-dried powdered particles can be approximately spherical in shape, nearly 
uniform in size and frequently hollow. There may be some degree of irregularity in shape 
depending upon the incorporated medicament and the spray drying conditions. In many 
instances the dispersion stability of spray-dried microspheres appears to be more effective if
an inflating agent (or blowing agent) is used in their production. Particularly preferred
embodiments may comprise an emulsion with an inflating agent as the disperse or continuous
phase (the other phase being aqueous in nature). An inflating agent is preferably dispersed
with a surfactant solution, using, for instance, a commercially available microfluidizer at a
pressure of about 5000 to 15,000 psi. This process forms an emulsion, preferably stabilized
by an incorporated surfactant, typically comprising submicron droplets of water immiscible
blowing agent dispersed in an aqueous continuous phase. The formation of such dispersions
using this and other techniques is common and well known to those in the art. The blowing
agent is preferably a fluorinated compound (e.g., perfluorohexane, perfluorooctyl bromide,
perfluorodecalin, perfluorobutyl ethane) which vaporizes during the spray-drying process,
leaving behind generally hollow, porous aerodynamically light microspheres. As will be
discussed in more detail below, other suitable blowing agents include chloroform, freons, and
hydrocarbons. Nitrogen gas and carbon dioxide are also contemplated as suitable blowing
agents.

Although the perforated microstructures are preferably formed using a blowing agent
as described above, it will be appreciated that, in some instances, no blowing agent is
required and an aqueous dispersion of the medicament and surfactant(s) is spray dried
directly. In such cases, the formulation may be amenable to process conditions (e.g., elevated
temperatures) that generally lead to the formation of hollow, relatively porous microparticles.
Moreover, the medicament may possess special physicochemical properties (e.g., high
crystallinity, elevated melting temperature, surface activity, etc.) that make it particularly
suitable for use in such techniques.

The perforated microstructures may optionally be associated with, or comprise, one
or more surfactants. Moreover, miscible surfactants may optionally be combined with the
suspension medium liquid phase. It will be appreciated by those skilled in the art that the use
of surfactants may further increase dispersion stability, simplify formulation procedures or
increase bioavailability upon administration. Of course combinations of surfactants, 
including the use of one or more in the liquid phase and one or more associated with the 
perforated microstructures are contemplated as being within the scope of the invention. By 
"associated with or comprise" it is meant that the structural matrix or perforated 
microstructure may incorporate, adsorb, absorb, be coated with or be formed by the 
surfactant. 

Surfactants suitable for use include any compound or composition that aids in the 
formation and maintenance of the stabilized respiratory dispersions by forming a layer at the 
interface between the structural matrix and the suspension medium. The surfactant may 
comprise a single compound or any combination of compounds, such as in the case of co-
surfactants. Particularly preferred surfactants are substantially insoluble in the propellant,
nonfluorinated, and selected from the group consisting of saturated and unsaturated lipids,
nonionic detergents, nonionic block copolymers, ionic surfactants, and combinations of such
agents. It should be emphasized that, in addition to the aforementioned surfactants, suitable
(i.e., biocompatible) fluorinated surfactants are compatible with the teachings herein and may
be used to provide the desired stabilized preparations. 

Lipids, including phospholipids, from both natural and synthetic sources may be used
in varying concentrations to form a structural matrix. Generally, compatible lipids comprise
those that have a gel to liquid crystal phase transition greater than about 40°C. Preferably,
the incorporated lipids are relatively long chain (i.e., C6-C22) saturated lipids and more
preferably comprise phospholipids. Exemplary phospholipids useful in the disclosed
stabilized preparations comprise egg phosphatidylcholine, dilauroylphosphatidylcholine,
dioleylphosphatidylcholine, dipalmitoylphosphatidyl-choline, disteroylphosphatidylcholine,
short-chain phosphatidylcholines, phosphatidylethanolamine,
dioleylphosphatidylethanolamine, phosphatidylserine, phosphatidylglycerol,
phosphatidylinositol, glycolipids, ganglioside GM1, sphingomyelin, phosphatidic acid,
cardiolipin; lipids bearing polymer chains such as polyethylene glycol, chitin, hyaluronic
acid, or polyvinylpyrrolidone; lipids bearing sulfonated mono-, di-, and polysaccharides;
fatty acids such as palmitic acid, stearic acid, and oleic acid; cholesterol, cholesterol esters,
and cholesterol hemisuccinate. Due to their excellent biocompatibility characteristics,
phospholipids and combinations of phospholipids and poloxamers are particularly suitable
for use in the stabilized dispersions disclosed herein.

Compatible nonionic detergents comprise: sorbitan esters including sorbitan trioleate 
(Span 85), sorbitan sesquioleate, sorbitan monooleate, sorbitan monolaurate,
polyoxyethylene (20) sorbitan monolaurate, and polyoxyethylene (20) sorbitan monooleate, 
oleyl polyoxyethylene (2) ether, stearyl polyoxyethylene (2) ether, lauryl polyoxyethylene (4) 
ether, glycerol esters, and sucrose esters. Other suitable nonionic detergents can be easily 
identified using McCutcheon's Emulsifiers and Detergents (McPublishing Co., Glen Rock, 
N.J.). Preferred block copolymers include diblock and triblock copolymers of 
polyoxyethylene and polyoxypropylene, including poloxamer 188 (Pluronic® F68),
poloxamer 407 (Pluronic® F-127), and poloxamer 338. Ionic surfactants such as sodium
sulfosuccinate, and fatty acid soaps may also be utilized. In preferred embodiments, the 
microstructures may comprise oleic acid or its alkali salt. 

In addition to the aforementioned surfactants, cationic surfactants or lipids are
preferred especially in the case of delivery of an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, or precursor thereof). Examples of suitable cationic lipids include:
DOTMA, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride; DOTAP,
1,2-dioleyloxy-3-(trimethylammonio)propane; and DOTB,
1,2-dioleyl-3-(4'-trimethylammonio)butanoyl-sn-glycerol. Polycationic amino acids such as
polylysine and polyarginine are also contemplated.

For the spraying process, such spraying methods as rotary atomization, pressure 
atomization and two-fluid atomization can be used. Examples of the devices used in these 
processes include "Parubisu [phonetic rendering] Mini-Spray GA-32" and "Parubisu Spray 
Drier DL-41," manufactured by Yamato Chemical Co. Alternatively, "Spray Drier CL-8," 
"Spray Drier L-8," "Spray Drier FL-12," "Spray Drier FL-16" or "Spray Drier FL-20," 
manufactured by Okawara Kakoki Co., can be used for spraying with a rotary-disk atomizer. 

While no particular restrictions are placed on the gas used to dry the sprayed material, 
it is recommended to use air, nitrogen gas or an inert gas. The inlet temperature of the 
gas used to dry the sprayed material is chosen such that it does not cause heat deactivation 
of the sprayed material. The inlet temperature may vary between about 50°C and about 100°C, 
preferably between about 50°C and 100°C. The temperature of the outlet gas used to dry the 
sprayed material may vary between about 0°C and about 150°C, preferably between 0°C and 
90°C, and even more preferably between 0°C and 60°C. 

The spray drying is done under conditions that result in substantially amorphous 
powder of homogeneous constitution having a particle size that is respirable, a low moisture 
content and flow characteristics that allow for ready aerosolization. Preferably the particle 
size of the resulting powder is such that more than about 98% of the mass is in particles 
having a diameter of about 10 µm or less with about 90% of the mass being in particles 
having a diameter less than 5 µm. Alternatively, about 95% of the mass will have particles 
with a diameter of less than 10 µm with about 80% of the mass of the particles having a 
diameter of less than 5 µm. 
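
The two particle size profiles above can be checked numerically against a measured size distribution. The following is a minimal sketch only, assuming the distribution is available as paired lists of particle diameters (in µm) and the mass found at each diameter. The function names are hypothetical, and the "about" limits in the text are treated here as hard thresholds purely for illustration.

```python
from typing import Sequence


def mass_fraction(diameters_um: Sequence[float], masses: Sequence[float],
                  cutoff_um: float, inclusive: bool = False) -> float:
    """Fraction of total mass in particles below (or at or below) the cutoff."""
    if len(diameters_um) != len(masses):
        raise ValueError("diameters and masses must have the same length")
    total = sum(masses)
    if total <= 0:
        raise ValueError("total mass must be positive")
    selected = sum(
        m for d, m in zip(diameters_um, masses)
        if (d <= cutoff_um if inclusive else d < cutoff_um)
    )
    return selected / total


def meets_preferred_profile(diameters_um, masses) -> bool:
    """More than 98% of mass at 10 µm or less, and 90% of mass below 5 µm."""
    return (mass_fraction(diameters_um, masses, 10.0, inclusive=True) > 0.98
            and mass_fraction(diameters_um, masses, 5.0) >= 0.90)


def meets_alternative_profile(diameters_um, masses) -> bool:
    """95% of mass below 10 µm, and 80% of mass below 5 µm."""
    return (mass_fraction(diameters_um, masses, 10.0) >= 0.95
            and mass_fraction(diameters_um, masses, 5.0) >= 0.80)


# Illustrative (made-up) distributions: diameter bins in µm and mass per bin.
bins = [1, 2, 3, 4, 6, 12]
fine_powder = [30, 30, 20, 11, 8, 1]
coarser_powder = [20, 20, 20, 22, 14, 4]

print(meets_preferred_profile(bins, fine_powder))
print(meets_alternative_profile(bins, coarser_powder))
```

In practice the acceptance limits would be set by the analytical method used for sizing. The hard cutoffs here only mirror the percentages stated in the text.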

The dispersible pharmaceutical-based dry powders that include the iRNA preparation 
may optionally be combined with pharmaceutical carriers or excipients which are suitable for 
respiratory and pulmonary administration. Such carriers may serve simply as bulking agents 
when it is desired to reduce the iRNA concentration in the powder which is being delivered 
to a patient, but may also serve to enhance the stability of the iRNA compositions and to 
improve the dispersibility of the powder within a powder dispersion device in order to 
provide more efficient and reproducible delivery of the iRNA and to improve handling 
characteristics of the iRNA such as flowability and consistency to facilitate manufacturing 
and powder filling. 

Such carrier materials may be combined with the drug prior to spray drying, i.e., by 
adding the carrier material to the purified bulk solution. In that way, the carrier particles will 
be formed simultaneously with the drug particles to produce a homogeneous powder. 
Alternatively, the carriers may be separately prepared in a dry powder form and combined 
with the dry powder drug by blending. The powder carriers will usually be crystalline (to 
avoid water absorption), but might in some cases be amorphous or mixtures of crystalline 
and amorphous. The size of the carrier particles may be selected to improve the flowability of 
the drug powder, typically being in the range from 25 µm to 100 µm. A preferred carrier 
material is crystalline lactose having a size in the above-stated range. 

Powders prepared by any of the above methods will be collected from the spray dryer 
in a conventional manner for subsequent use. For use as pharmaceuticals and other purposes, 
it will frequently be desirable to disrupt any agglomerates which may have formed by 
screening or other conventional techniques. For pharmaceutical uses, the dry powder 
formulations will usually be measured into a single dose, and the single dose sealed into a 
package. Such packages are particularly useful for dispersion in dry powder inhalers, as 
described in detail below. Alternatively, the powders may be packaged in multiple-dose 
containers. 

Methods for spray drying hydrophobic and other drugs and components are described 
in U.S. Pat. Nos. 5,000,888; 5,026,550; 4,670,419; 4,540,602; and 4,486,435. Bloch and 
Speiser (1983) Pharm. Acta Helv. 58:14-22 teaches spray drying of hydrochlorothiazide and 
chlorthalidone (lipophilic drugs) and a hydrophilic adjuvant (pentaerythritol) in azeotropic 
solvents of dioxane-water and 2-ethoxyethanol-water. A number of Japanese Patent 
application Abstracts relate to spray drying of hydrophilic-hydrophobic product 
combinations, including JP 806766; JP 7242568; JP 7101884; JP 7101883; JP 71018982; JP 
7101881; and JP 4036233. Other foreign patent publications relevant to spray drying 
hydrophilic-hydrophobic product combinations include FR 2594693; DE 2209477; and 
WO 88/07870. 

LYOPHILIZATION 

An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) preparation can be made by lyophilization. Lyophilization is a freeze-
drying process in which water is sublimed from the composition after it is frozen. The 
particular advantage associated with the lyophilization process is that biologicals and 
pharmaceuticals that are relatively unstable in an aqueous solution can be dried without 
elevated temperatures (thereby eliminating the adverse thermal effects), and then stored in a 
dry state where there are few stability problems. With respect to the instant invention such 
techniques are particularly compatible with the incorporation of nucleic acids in perforated 
microstructures without compromising physiological activity. Methods for providing 
lyophilized particulates are known to those of skill in the art and it would clearly not require 
undue experimentation to provide dispersion compatible microstructures in accordance with 
the teachings herein. Accordingly, to the extent that lyophilization processes may be used to 
provide microstructures having the desired porosity and size, they are in conformance with the 
teachings herein and are expressly contemplated as being within the scope of the instant 
invention. 

Targeting 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNAs. It should be understood, however, that 
these formulations, compositions and methods can be practiced with other iRNA agents, e.g., 
modified iRNA agents, and such practice is within the invention. 

In some embodiments, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA 
agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA 
agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or 
sRNA agent, or precursor thereof) is targeted to a particular cell. For example, a liposome 
or particle or other structure that includes an iRNA can also include a targeting moiety that 
recognizes a specific molecule on a target cell. The targeting moiety can be a molecule with 
a specific affinity for a target cell. Targeting moieties can include antibodies directed against 
a protein found on the surface of a target cell, or the ligand or a receptor-binding portion of a 
ligand for a molecule found on the surface of a target cell. For example, the targeting moiety 
can recognize a cancer-specific antigen (e.g., CA15-3, CA19-9, CEA, or HER2/neu) or a 
viral antigen, thus delivering the iRNA to a cancer cell or a virus-infected cell. Exemplary 
targeting moieties include antibodies (such as IgM, IgG, IgA, IgD, and the like, or 
functional portions thereof) and ligands for cell surface receptors (e.g., ectodomains thereof). 

Table 3 provides a number of antigens which can be used to target selected cells. 



Table 3. 

ANTIGEN                                   Exemplary tumor tissue 

CEA (carcinoembryonic antigen)            colon, breast, lung 
PSA (prostate specific antigen)           prostate cancer 
CA-125                                    ovarian cancer 
CA15-3                                    breast cancer 
CA19-9                                    breast cancer 
HER2/neu                                  breast cancer 
α-fetoprotein                             testicular cancer, hepatic cancer 
β-HCG (human chorionic gonadotropin)      testicular cancer, choriocarcinoma 
MUC-1                                     breast cancer 
Estrogen receptor                         breast cancer, uterine cancer 
Progesterone receptor                     breast cancer, uterine cancer 
EGFr (epidermal growth factor receptor)   bladder cancer 
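
As an illustration of how Table 3 might be used when selecting a targeting moiety, the antigen-to-tissue pairs can be held in a simple lookup structure. This is a minimal sketch only. It assumes the antigens and tumor tissues pair up row by row in the order listed, and the names `ANTIGEN_TO_TUMORS` and `antigens_for_tumor` are hypothetical.

```python
from typing import Dict, List, Tuple

# Antigen -> exemplary tumor tissues, following the row order of Table 3
# (the row-by-row pairing is an assumption of this sketch).
ANTIGEN_TO_TUMORS: Dict[str, Tuple[str, ...]] = {
    "CEA": ("colon", "breast", "lung"),
    "PSA": ("prostate",),
    "CA-125": ("ovarian",),
    "CA15-3": ("breast",),
    "CA19-9": ("breast",),
    "HER2/neu": ("breast",),
    "alpha-fetoprotein": ("testicular", "hepatic"),
    "beta-HCG": ("testicular", "choriocarcinoma"),
    "MUC-1": ("breast",),
    "Estrogen receptor": ("breast", "uterine"),
    "Progesterone receptor": ("breast", "uterine"),
    "EGFr": ("bladder",),
}


def antigens_for_tumor(tumor: str) -> List[str]:
    """Return the Table 3 antigens listed for a given tumor tissue."""
    key = tumor.strip().lower()
    return sorted(a for a, tissues in ANTIGEN_TO_TUMORS.items() if key in tissues)


print(antigens_for_tumor("ovarian"))
print(antigens_for_tumor("testicular"))
```

A reverse lookup of this kind only reflects the exemplary tissues named in the table. It is not a statement that an antigen is absent from other tissues.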



In one embodiment, the targeting moiety is attached to a liposome. For example, US 
6,245,427 describes a method for targeting a liposome using a protein or peptide. In another 
example, a cationic lipid component of the liposome is derivatized with a targeting moiety. 
For example, WO 96/37194 describes converting N-glutaryldioleoylphosphatidyl 
ethanolamine to a N-hydroxysuccinimide activated ester. The product was then coupled to 
an RGD peptide. 

GENES AND DISEASES 

In one aspect, the invention features a method of treating a subject at risk for or 
afflicted with unwanted cell proliferation, e.g., malignant or nonmalignant cell proliferation. 

The method includes: 

providing an iRNA agent, e.g., an sRNA or iRNA agent described herein, e.g., an 
iRNA having a structure described herein, where the iRNA is homologous to and can silence, 
e.g., by cleavage, a gene which promotes unwanted cell proliferation; 

administering an iRNA agent, e.g., an sRNA or iRNA agent described herein to a 
subject, preferably a human subject, 

thereby treating the subject. 

In a preferred embodiment the gene is a growth factor or growth factor receptor gene, 
a kinase, e.g., a protein tyrosine, serine or threonine kinase gene, an adaptor protein gene, a 
gene encoding a G protein superfamily molecule, or a gene encoding a transcription factor. 

In a preferred embodiment the iRNA agent silences the PDGF beta gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted PDGF 
beta expression, e.g., testicular and lung cancers. 

In another preferred embodiment the iRNA agent silences the Erb-B gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted Erb- 
B expression, e.g., breast cancer. 

In a preferred embodiment the iRNA agent silences the Src gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted Src 
expression, e.g., colon cancers. 

In a preferred embodiment the iRNA agent silences the CRK gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted CRK 
expression, e.g., colon and lung cancers. 

In a preferred embodiment the iRNA agent silences the GRB2 gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted GRB2 
expression, e.g., squamous cell carcinoma. 

In another preferred embodiment the iRNA agent silences the RAS gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted RAS 
expression, e.g., pancreatic, colon and lung cancers, and chronic leukemia. 

In another preferred embodiment the iRNA agent silences the MEKK gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted 
MEKK expression, e.g., squamous cell carcinoma, melanoma or leukemia. 

In another preferred embodiment the iRNA agent silences the JNK gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted JNK 
expression, e.g., pancreatic or breast cancers. 

In a preferred embodiment the iRNA agent silences the RAF gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted RAF 
expression, e.g., lung cancer or leukemia. 

In a preferred embodiment the iRNA agent silences the Erk1/2 gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted Erk1/2 
expression, e.g., lung cancer. 

In another preferred embodiment the iRNA agent silences the PCNA(p21) gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
PCNA expression, e.g., lung cancer. 

In a preferred embodiment the iRNA agent silences the MYB gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted MYB 
expression, e.g., colon cancer or chronic myelogenous leukemia. 

In a preferred embodiment the iRNA agent silences the c-MYC gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted c-MYC 
expression, e.g., Burkitt's lymphoma or neuroblastoma. 

In another preferred embodiment the iRNA agent silences the JUN gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted JUN 
expression, e.g., ovarian, prostate or breast cancers. 

In another preferred embodiment the iRNA agent silences the FOS gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted FOS 
expression, e.g., skin or prostate cancers. 

In a preferred embodiment the iRNA agent silences the BCL-2 gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted BCL-2 
expression, e.g., lung or prostate cancers or Non-Hodgkin lymphoma. 

In a preferred embodiment the iRNA agent silences the Cyclin D gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted Cyclin D 
expression, e.g., esophageal and colon cancers. 

In a preferred embodiment the iRNA agent silences the VEGF gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted VEGF 
expression, e.g., esophageal and colon cancers. 

In a preferred embodiment the iRNA agent silences the EGFR gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted EGFR 
expression, e.g., breast cancer. 

In another preferred embodiment the iRNA agent silences the Cyclin A gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
Cyclin A expression, e.g., lung and cervical cancers. 

In another preferred embodiment the iRNA agent silences the Cyclin E gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted 
Cyclin E expression, e.g., lung and breast cancers. 

In another preferred embodiment the iRNA agent silences the WNT-1 gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted 
WNT-1 expression, e.g., basal cell carcinoma. 

In another preferred embodiment the iRNA agent silences the beta-catenin gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
beta-catenin expression, e.g., adenocarcinoma or hepatocellular carcinoma. 

In another preferred embodiment the iRNA agent silences the c-MET gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted c- 
MET expression, e.g., hepatocellular carcinoma. 

In another preferred embodiment the iRNA agent silences the PKC gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted PKC 
expression, e.g., breast cancer. 

In a preferred embodiment the iRNA agent silences the NFKB gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted NFKB 
expression, e.g., breast cancer. 

In a preferred embodiment the iRNA agent silences the STAT3 gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted STAT3 
expression, e.g., prostate cancer. 

In another preferred embodiment the iRNA agent silences the survivin gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted 
survivin expression, e.g., cervical or pancreatic cancers. 

In another preferred embodiment the iRNA agent silences the Her2/Neu gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
Her2/Neu expression, e.g., breast cancer. 

In another preferred embodiment the iRNA agent silences the topoisomerase I gene, 
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted topoisomerase I expression, e.g., ovarian and colon cancers. 

In a preferred embodiment the iRNA agent silences the topoisomerase II alpha gene, 
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted topoisomerase II expression, e.g., breast and colon cancers. 

In a preferred embodiment the iRNA agent silences mutations in the p73 gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
p73 expression, e.g., colorectal adenocarcinoma. 

In a preferred embodiment the iRNA agent silences mutations in the 
p21(WAF1/CIP1) gene, and thus can be used to treat a subject having or at risk for a disorder 
characterized by unwanted p21(WAF1/CIP1) expression, e.g., liver cancer. 

In a preferred embodiment the iRNA agent silences mutations in the p27(KIP1) gene, 
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted p27(KIP1) expression, e.g., liver cancer. 

In a preferred embodiment the iRNA agent silences mutations in the PPM1D gene, 
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted PPM1D expression, e.g., breast cancer. 

In a preferred embodiment the iRNA agent silences mutations in the RAS gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
RAS expression, e.g., breast cancer. 

In another preferred embodiment the iRNA agent silences mutations in the caveolin I 
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted caveolin I expression, e.g., esophageal squamous cell carcinoma. 

In another preferred embodiment the iRNA agent silences mutations in the MIB I 
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted MIB I expression, e.g., male breast carcinoma (MBC). 

In another preferred embodiment the iRNA agent silences mutations in the MTA1 
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted MTA1 expression, e.g., ovarian carcinoma. 

In another preferred embodiment the iRNA agent silences mutations in the M68 gene, 
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted M68 expression, e.g., human adenocarcinomas of the esophagus, stomach, colon, 
and rectum. 

In preferred embodiments the iRNA agent silences mutations in tumor suppressor 
genes, and thus can be used as a method to promote apoptotic activity in combination with 
chemotherapeutics. 

In a preferred embodiment the iRNA agent silences mutations in the p53 tumor 
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder 
characterized by unwanted p53 expression, e.g., gall bladder, pancreatic and lung cancers. 

In a preferred embodiment the iRNA agent silences mutations in the p53 family 
member DN-p63, and thus can be used to treat a subject having or at risk for a disorder 
characterized by unwanted DN-p63 expression, e.g., squamous cell carcinoma. 

In a preferred embodiment the iRNA agent silences mutations in the pRb tumor 
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder 
characterized by unwanted pRb expression, e.g., oral squamous cell carcinoma. 

In a preferred embodiment the iRNA agent silences mutations in the APC1 tumor 
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder 
characterized by unwanted APC1 expression, e.g., colon cancer. 

In a preferred embodiment the iRNA agent silences mutations in the BRCA1 tumor 
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder 
characterized by unwanted BRCA1 expression, e.g., breast cancer. 

In a preferred embodiment the iRNA agent silences mutations in the PTEN tumor 
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder 
characterized by unwanted PTEN expression, e.g., hamartomas, gliomas, and prostate and 
endometrial cancers. 

In a preferred embodiment the iRNA agent silences MLL fusion genes, e.g., MLL- 
AF9, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted MLL fusion gene expression, e.g., acute leukemias. 

In another preferred embodiment the iRNA agent silences the BCR/ABL fusion gene, 
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted BCR/ABL fusion gene expression, e.g., acute and chronic leukemias. 

In another preferred embodiment the iRNA agent silences the TEL/AML1 fusion 
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted TEL/AML1 fusion gene expression, e.g., childhood acute leukemia. 

In another preferred embodiment the iRNA agent silences the EWS/FLI1 fusion gene, 
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted EWS/FLI1 fusion gene expression, e.g., Ewing Sarcoma. 

In another preferred embodiment the iRNA agent silences the TLS/FUS1 fusion gene, 
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted TLS/FUS1 fusion gene expression, e.g., Myxoid liposarcoma. 

In another preferred embodiment the iRNA agent silences the PAX3/FKHR fusion 
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted PAX3/FKHR fusion gene expression, e.g., Myxoid liposarcoma. 

In another preferred embodiment the iRNA agent silences the AML1/ETO fusion 
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted AML1/ETO fusion gene expression, e.g., acute leukemia. 

In another aspect, the invention features a method of treating a subject, e.g., a human, 
at risk for or afflicted with a disease or disorder that may benefit by angiogenesis inhibition, 
e.g., cancer. The method includes: 

providing an iRNA agent, e.g., an iRNA agent having a structure described herein, 
which iRNA agent is homologous to and can silence, e.g., by cleavage, a gene which 
mediates angiogenesis; 

administering the iRNA agent to a subject, 

thereby treating the subject. 

In a preferred embodiment the iRNA agent silences the alpha v-integrin gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
alpha v-integrin, e.g., brain tumors or tumors of epithelial origin. 

In a preferred embodiment the iRNA agent silences the Flt-1 receptor gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted Flt-1 
receptors, e.g., cancer and rheumatoid arthritis. 

In a preferred embodiment the iRNA agent silences the tubulin gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted tubulin, e.g., 
cancer and retinal neovascularization. 


In another aspect, the invention features a method of treating a subject infected with a 
virus or at risk for or afflicted with a disorder or disease associated with a viral infection. 
The method includes: 

providing an iRNA agent, e.g., an iRNA agent having a structure described herein, 
which iRNA agent is homologous to and can silence, e.g., by cleavage, a viral gene or a 
cellular gene which mediates viral function, e.g., entry or growth; 

administering the iRNA agent to a subject, preferably a human subject, 

thereby treating the subject. 

Thus, the invention provides for a method of treating patients infected by the Human 
Papilloma Virus (HPV) or at risk for or afflicted with a disorder mediated by HPV, e.g., 
cervical cancer. HPV is linked to 95% of cervical carcinomas and thus an antiviral therapy is 
an attractive method to treat these cancers and other symptoms of viral infection. 

In a preferred embodiment, the expression of a HPV gene is reduced. In another 
preferred embodiment, the HPV gene is one of the group of E2, E6, or E7. 

In a preferred embodiment the expression of a human gene that is required for HPV 
replication is reduced. 

The invention also includes a method of treating patients infected by the Human 
Immunodeficiency Virus (HIV) or at risk for or afflicted with a disorder mediated by HIV, 
e.g., Acquired Immune Deficiency Syndrome (AIDS). 

In a preferred embodiment, the expression of a HIV gene is reduced. In another 
preferred embodiment, the HIV gene is CCR5, Gag, or Rev. 

In a preferred embodiment the expression of a human gene that is required for HIV 
replication is reduced. In another preferred embodiment, the gene is CD4 or Tsg101. 

The invention also includes a method for treating patients infected by the Hepatitis B 
Virus (HBV) or at risk for or afflicted with a disorder mediated by HBV, e.g., cirrhosis and 
hepatocellular carcinoma. 

In a preferred embodiment, the expression of a HBV gene is reduced. In another 
preferred embodiment, the targeted HBV gene encodes one of the group of the tail region of 
the HBV core protein, the pre-cregious (pre-c) region, or the cregious (c) region. In another 
preferred embodiment, a targeted HBV-RNA sequence is comprised of the poly(A) tail. 

In a preferred embodiment the expression of a human gene that is required for HBV 
replication is reduced. 

The invention also provides for a method of treating patients infected by the Hepatitis 
A Virus (HAV), or at risk for or afflicted with a disorder mediated by HAV. 

In a preferred embodiment the expression of a human gene that is required for HAV 
replication is reduced. 

The present invention provides for a method of treating patients infected by the 
Hepatitis C Virus (HCV), or at risk for or afflicted with a disorder mediated by HCV, e.g., 
cirrhosis. 

In a preferred embodiment, the expression of a HCV gene is reduced. 

In another preferred embodiment the expression of a human gene that is required for 
HCV replication is reduced. 

The present invention also provides for a method of treating patients infected by 
any of the group of Hepatitis Viral strains comprising hepatitis D, E, F, G, or H, or patients at 
risk for or afflicted with a disorder mediated by any of these strains of hepatitis. 

In a preferred embodiment, the expression of a hepatitis D, E, F, G, or H gene is 
reduced. 

In another preferred embodiment the expression of a human gene that is required for 
hepatitis D, E, F, G or H replication is reduced. 

Methods of the invention also provide for treating patients infected by the 
Respiratory Syncytial Virus (RSV) or at risk for or afflicted with a disorder mediated by 
RSV, e.g., lower respiratory tract infection in infants and childhood asthma, pneumonia and 
other complications, e.g., in the elderly. 

In a preferred embodiment, the expression of a RSV gene is reduced. In another 
preferred embodiment, the targeted RSV gene encodes one of the group of genes N, L, or P. 

In a preferred embodiment the expression of a human gene that is required for RSV 
replication is reduced. 

Methods of the invention provide for treating patients infected by the Herpes 
Simplex Virus (HSV) or at risk for or afflicted with a disorder mediated by HSV, e.g., genital 
herpes and cold sores as well as life-threatening or sight-impairing disease mainly in 
immunocompromised patients. 

In a preferred embodiment, the expression of a HSV gene is reduced. In another 
preferred embodiment, the targeted HSV gene encodes DNA polymerase or the helicase- 
primase. 

In a preferred embodiment the expression of a human gene that is required for HSV 
replication is reduced. 

The invention also provides a method for treating patients infected by the herpes 
Cytomegalovirus (CMV) or at risk for or afflicted with a disorder mediated by CMV, e.g., 
congenital virus infections and morbidity in immunocompromised patients. 

In a preferred embodiment, the expression of a CMV gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for CMV 
replication is reduced. 

Methods of the invention also provide for a method of treating patients infected by 
the herpes Epstein Barr Virus (EBV) or at risk for or afflicted with a disorder mediated by 
EBV, e.g., NK/T-cell lymphoma, non-Hodgkin lymphoma, and Hodgkin disease. 

In a preferred embodiment, the expression of a EBV gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for EBV 
replication is reduced. 

Methods of the invention also provide for treating patients infected by Kaposi's 
Sarcoma-associated Herpes Virus (KSHV), also called human herpesvirus 8, or patients at 
risk for or afflicted with a disorder mediated by KSHV, e.g., Kaposi's sarcoma, multicentric 
Castleman's disease and AIDS-associated primary effusion lymphoma. 

In a preferred embodiment, the expression of a KSHV gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for KSHV 
replication is reduced. 

The invention also includes a method for treating patients infected by the JC Virus 
(JCV) or a disease or disorder associated with this virus, e.g., progressive multifocal 
leukoencephalopathy (PML). 

In a preferred embodiment, the expression of a JCV gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for JCV 
replication is reduced. 

Methods of the invention also provide for treating patients infected by the myxovirus 
or at risk for or afflicted with a disorder mediated by myxovirus, e.g., influenza. 

In a preferred embodiment, the expression of a myxovirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
myxovirus replication is reduced. 

Methods of the invention also provide for treating patients infected by the rhinovirus 
or at risk for or afflicted with a disorder mediated by rhinovirus, e.g., the common cold. 

In a preferred embodiment, the expression of a rhinovirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
rhinovirus replication is reduced. 
20 Methods of the invention also provide for treating patients infected by the coronavirus 

or at risk for of afflicted with a disorder mediated by coronavirus, e.g., the coimnon cold. 

In a preferred embodiment, the expression of a coronavirus gene is reduced. 

In preferred embodiment the expression of a himian gene that is required for 
coronavirus replication is reduced. 
25 Methods of the invention also provide for treating patients infected by the flavivirus 

West Nile or at risk for or afflicted with a disorder mediated by West Nile Virus. 

In a preferred embodiment, the expression of a West Nile Virus gene is reduced. In 
another prefeiTed embodiment, the West Nile Virus gene is one of the group comprising E, 
NS3,orNS5. 

30 In a preferred embodiment the expression of a human gene that is required for West 

Nile Vifus replication is reduced. 

Methods of the invention also provide for treating patients infected by the St. Louis 
Encephalitis flavivirus, or at risk for or afflicted with a disease or disorder associated with 
this virus, e.g., viral haemorrhagic fever or neurological disease. 

In a preferred embodiment, the expression of a St. Louis Encephalitis gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for St. 
Louis Encephalitis virus replication is reduced. 

Methods of the invention also provide for treating patients infected by the Tick-borne 
encephalitis flavivirus, or at risk for or afflicted with a disorder mediated by Tick-borne 
encephalitis virus, e.g., viral haemorrhagic fever and neurological disease. 

In a preferred embodiment, the expression of a Tick-borne encephalitis virus gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Tick-borne encephalitis virus replication is reduced. 

Methods of the invention also provide for treating patients infected by the 
Murray Valley encephalitis flavivirus, which commonly results in viral haemorrhagic fever 
and neurological disease. 

In a preferred embodiment, the expression of a Murray Valley encephalitis virus gene 
is reduced. 

In a preferred embodiment the expression of a human gene that is required for Murray 
Valley encephalitis virus replication is reduced. 

The invention also includes methods for treating patients infected by the dengue 
flavivirus, or a disease or disorder associated with this virus, e.g., dengue haemorrhagic 
fever. 

In a preferred embodiment, the expression of a dengue virus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for dengue 
virus replication is reduced. 

Methods of the invention also provide for treating patients infected by the Simian 
Virus 40 (SV40) or at risk for or afflicted with a disorder mediated by SV40, e.g., 
tumorigenesis. 

In a preferred embodiment, the expression of an SV40 gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for SV40 
replication is reduced. 

The invention also includes methods for treating patients infected by the Human T 
Cell Lymphotropic Virus (HTLV), or a disease or disorder associated with this virus, e.g., 
leukemia and myelopathy. 

In a preferred embodiment, the expression of an HTLV gene is reduced. In another 
preferred embodiment the HTLV1 gene is the Tax transcriptional activator. 

In a preferred embodiment the expression of a human gene that is required for HTLV 
replication is reduced. 

Methods of the invention also provide for treating patients infected by the Moloney- 
Murine Leukemia Virus (Mo-MuLV) or at risk for or afflicted with a disorder mediated by 
Mo-MuLV, e.g., T-cell leukemia. 

In a preferred embodiment, the expression of a Mo-MuLV gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Mo-MuLV replication is reduced. 

Methods of the invention also provide for treating patients infected by the 
encephalomyocarditis virus (EMCV) or at risk for or afflicted with a disorder mediated by 
EMCV, e.g., myocarditis. EMCV leads to myocarditis in mice and pigs and is capable of 
infecting human myocardial cells. This virus is therefore a concern for patients undergoing 
xenotransplantation. 

In a preferred embodiment, the expression of an EMCV gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for EMCV 
replication is reduced. 

The invention also includes a method for treating patients infected by the measles 
virus (MV) or at risk for or afflicted with a disorder mediated by MV, e.g. measles. 

In a preferred embodiment, the expression of a MV gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for MV 
replication is reduced. 

The invention also includes a method for treating patients infected by the Varicella 
zoster virus (VZV) or at risk for or afflicted with a disorder mediated by VZV, e.g., chicken 
pox or shingles (also called zoster). 

In a preferred embodiment, the expression of a VZV gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for VZV 
replication is reduced. 

The invention also includes a method for treating patients infected by an adenovirus 
or at risk for or afflicted with a disorder mediated by an adenovirus, e.g., respiratory tract 
infection. 

In a preferred embodiment, the expression of an adenovirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
adenovirus replication is reduced. 

The invention includes a method for treating patients infected by a yellow fever virus 
(YFV) or at risk for or afflicted with a disorder mediated by a YFV, e.g. respiratory tract 
infection. 

In a preferred embodiment, the expression of a YFV gene is reduced. In another 
preferred embodiment, the preferred gene is one of a group that includes the E, NS2A, or 
NS3 genes. 

In a preferred embodiment the expression of a human gene that is required for YFV 
replication is reduced. 

Methods of the invention also provide for treating patients infected by the poliovirus 
or at risk for or afflicted with a disorder mediated by poliovirus, e.g., polio. 

In a preferred embodiment, the expression of a poliovirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
poliovirus replication is reduced. 

Methods of the invention also provide for treating patients infected by a poxvirus or 
at risk for or afflicted with a disorder mediated by a poxvirus, e.g., smallpox. 

In a preferred embodiment, the expression of a poxvirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
poxvirus replication is reduced. 

In another aspect, the invention features methods of treating a subject infected with a 
pathogen, e.g., a bacterial, amoebic, parasitic, or fungal pathogen. The method includes: 

providing an iRNA agent, e.g., an siRNA having a structure described herein, where the 
siRNA is homologous to and can silence, e.g., by cleavage, a pathogen gene; 

administering the iRNA agent to a subject, preferably a human subject, 

thereby treating the subject. 

The target gene can be one involved in growth, cell wall synthesis, protein synthesis, 
transcription, energy metabolism, e.g., the Krebs cycle, or toxin production. 

Thus, the present invention provides for a method of treating patients infected by a 
Plasmodium that causes malaria. 

In a preferred embodiment, the expression of a Plasmodium gene is reduced. In 
another preferred embodiment, the gene is apical membrane antigen 1 (AMA1). 

In a preferred embodiment the expression of a human gene that is required for 
Plasmodium replication is reduced. 

The invention also includes methods for treating patients infected by the 
Mycobacterium ulcerans, or a disease or disorder associated with this pathogen, e.g., Buruli 
ulcers. 

In a preferred embodiment, the expression of a Mycobacterium ulcerans gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Mycobacterium ulcerans replication is reduced. 

The invention also includes methods for treating patients infected by the 
Mycobacterium tuberculosis, or a disease or disorder associated with this pathogen, e.g. 
tuberculosis. 

In a preferred embodiment, the expression of a Mycobacterium tuberculosis gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Mycobacterium tuberculosis replication is reduced. 

The invention also includes methods for treating patients infected by the 
Mycobacterium leprae, or a disease or disorder associated with this pathogen, e.g. leprosy. 

In a preferred embodiment, the expression of a Mycobacterium leprae gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Mycobacterium leprae replication is reduced. 

The invention also includes methods for treating patients infected by the bacteria 
Staphylococcus aureus, or a disease or disorder associated with this pathogen, e.g., infections 
of the skin and mucous membranes. 

In a preferred embodiment, the expression of a Staphylococcus aureus gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Staphylococcus aureus replication is reduced. 

The invention also includes methods for treating patients infected by the bacteria 
Streptococcus pneumoniae, or a disease or disorder associated with this pathogen, e.g. 
pneumonia or childhood lower respiratory tract infection. 

In a preferred embodiment, the expression of a Streptococcus pneumoniae gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Streptococcus pneumoniae replication is reduced. 

The invention also includes methods for treating patients infected by the bacteria 
Streptococcus pyogenes, or a disease or disorder associated with this pathogen, e.g. Strep 
throat or Scarlet fever. 

In a preferred embodiment, the expression of a Streptococcus pyogenes gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Streptococcus pyogenes replication is reduced. 

The invention also includes methods for treating patients infected by the bacteria 
Chlamydia pneumoniae, or a disease or disorder associated with this pathogen, e.g. 
pneumonia or childhood lower respiratory tract infection. 

In a preferred embodiment, the expression of a Chlamydia pneumoniae gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Chlamydia pneumoniae replication is reduced. 

The invention also includes methods for treating patients infected by the bacteria 
Mycoplasma pneumoniae, or a disease or disorder associated with this pathogen, e.g. 
pneumonia or childhood lower respiratory tract infection. 

In a preferred embodiment, the expression of a Mycoplasma pneumoniae gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Mycoplasma pneumoniae replication is reduced. 

In one aspect, the invention features a method of treating a subject, e.g., a human, at 
risk for or afflicted with a disease or disorder characterized by an unwanted immune 
response, e.g., an inflammatory disease or disorder, or an autoimmune disease or disorder. 
The method includes: 

providing an iRNA agent, e.g., an iRNA agent having a structure described herein, 
which iRNA agent is homologous to and can silence, e.g., by cleavage, a gene which 
mediates an unwanted immune response; 

administering the iRNA agent to a subject, 

thereby treating the subject. 

In a preferred embodiment the disease or disorder is an ischemia or reperfusion 
injury, e.g., ischemia or reperfusion injury associated with acute myocardial infarction, 
unstable angina, cardiopulmonary bypass, surgical intervention e.g., angioplasty, e.g., 
percutaneous transluminal coronary angioplasty, the response to a transplanted organ or 
tissue, e.g., transplanted cardiac or vascular tissue; or thrombolysis. 

In a preferred embodiment the disease or disorder is restenosis, e.g., restenosis 
associated with surgical intervention e.g., angioplasty, e.g., percutaneous transluminal 
coronary angioplasty. 

In a preferred embodiment the disease or disorder is Inflammatory Bowel Disease, 
e.g., Crohn Disease or Ulcerative Colitis. 

In a preferred embodiment the disease or disorder is inflammation associated with an 
infection or injury. 

In a preferred embodiment the disease or disorder is asthma, lupus, multiple sclerosis, 
diabetes, e.g., type II diabetes, arthritis, e.g., rheumatoid or psoriatic. 

In particularly preferred embodiments the iRNA agent silences an integrin or 
co-ligand thereof, e.g., VLA4, VCAM, ICAM. 

In particularly preferred embodiments the iRNA agent silences a selectin or co-ligand 
thereof, e.g., P-selectin, E-selectin (ELAM), L-selectin, P-selectin glycoprotein-1 (PSGL-1). 

In particularly preferred embodiments the iRNA agent silences a component of the 
complement system, e.g., C3, C5, C3aR, C5aR, C3 convertase, C5 convertase. 

In particularly preferred embodiments the iRNA agent silences a chemokine or 
receptor thereof, e.g., TNFα, TNFβ, IL-1α, IL-1β, IL-2, IL-2R, IL-4, IL-4R, IL-5, IL-6, IL-8, 
TNFRI, TNFRII, IgE, SCYA11, CCR3. 

In other embodiments the iRNA agent silences GCSF, Gro1, Gro2, Gro3, PF4, MIG, 
Pro-Platelet Basic Protein (PPBP), MIP-1α, MIP-1β, RANTES, MCP-1, MCP-2, MCP-3, 
CMBKR1, CMBKR2, CMBKR3, CMBKR5, AIF-1, I-309. 

In one aspect, the invention features a method of treating a subject, e.g., a human, at 
risk for or afflicted with acute pain or chronic pain. The method includes: 

providing an iRNA agent, which iRNA is homologous to and can silence, e.g., by 
cleavage, a gene which mediates the processing of pain; 

administering the iRNA to a subject, 

thereby treating the subject. 

In particularly preferred embodiments the iRNA agent silences a component of an ion 
channel. 

In particularly preferred embodiments the iRNA agent silences a neurotransmitter 
receptor or ligand. 

In one aspect, the invention features a method of treating a subject, e.g., a human, at 
risk for or afflicted with a neurological disease or disorder. The method includes: 

providing an iRNA agent which iRNA is homologous to and can silence, e.g., by 
cleavage, a gene which mediates a neurological disease or disorder; 

administering the iRNA to a subject, 

thereby treating the subject. 

In a preferred embodiment the disease or disorder is Alzheimer Disease or Parkinson 
Disease. 

In particularly preferred embodiments the iRNA agent silences an amyloid-family 
gene, e.g., APP; a presenilin gene, e.g., PSEN1 and PSEN2, or α-synuclein. 

In a preferred embodiment the disease or disorder is a neurodegenerative trinucleotide 
repeat disorder, e.g., Huntington disease, dentatorubral pallidoluysian atrophy or a 
spinocerebellar ataxia, e.g., SCA1, SCA2, SCA3 (Machado-Joseph disease), SCA7 or SCA8. 






In particularly preferred embodiments the iRNA agent silences HD, DRPLA, SCA1, SCA2, 
MJD1, CACNL1A4, SCA7, SCA8. 

The loss of heterozygosity (LOH) can result in hemizygosity for sequence, e.g., 
genes, in the area of LOH. This can result in a significant genetic difference between normal 
and disease-state cells, e.g., cancer cells, and provides a useful basis for distinguishing 
the two. This difference can arise because a gene or other 
sequence is heterozygous in euploid cells but is hemizygous in cells having LOH. The 
regions of LOH will often include a gene, the loss of which promotes unwanted proliferation, 
e.g., a tumor suppressor gene, and other sequences including, e.g., other genes, in some cases 
a gene which is essential for normal function, e.g., growth. Methods of the invention rely, in 
part, on the specific cleavage or silencing of one allele of an essential gene with an iRNA 
agent of the invention. The iRNA agent is selected such that it targets the single allele of the 
essential gene found in the cells having LOH but does not silence the other allele, which is 
present in cells which do not show LOH. In essence, it discriminates between the two 
alleles, preferentially silencing the selected allele. In essence, polymorphisms, e.g., SNPs of 
essential genes that are affected by LOH, are used as a target for a disorder characterized by 
cells having LOH, e.g., cancer cells having LOH. 

For example, one of ordinary skill in the art can identify essential genes which are in 
proximity to tumor suppressor genes, and which are within a LOH region which includes the 
tumor suppressor gene. The gene encoding the large subunit of human RNA polymerase II, 
POLR2A, a gene located in close proximity to the tumor suppressor gene p53, is such a gene. 
It frequently occurs within a region of LOH in cancer cells. Other genes that occur within 
LOH regions and are lost in many cancer cell types include the group comprising replication 
protein A 70-kDa subunit, replication protein A 32-kD, ribonucleotide reductase, thymidylate 
synthase, TATA associated factor 2H, ribosomal protein S14, eukaryotic initiation factor 5A, 
alanyl tRNA synthetase, cysteinyl tRNA synthetase, NaK ATPase, alpha-1 subunit, and 
transferrin receptor. 

Accordingly, the invention features a method of treating a disorder characterized by 
LOH, e.g., cancer. The method includes: 

optionally, determining the genotype of the allele of a gene in the region of LOH and 
preferably determining the genotype of both alleles of the gene in a normal cell; 

providing an iRNA agent which preferentially cleaves or silences the allele found in 
the LOH cells; 

administering the iRNA to the subject, 

thereby treating the disorder. 
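
The allele-selection logic of the steps above can be sketched as follows. This is a minimal illustration only, not a method taken from the specification: the function name `pick_target_allele`, the representation of a genotype as a pair of single-letter SNP alleles, and the example alleles are all assumptions introduced here.

```python
def pick_target_allele(normal_genotype, loh_allele):
    """Return the allele an iRNA agent would be directed against, or None.

    normal_genotype: pair of alleles observed at a polymorphic site of an
                     essential gene in a normal (non-LOH) cell (hypothetical input).
    loh_allele:      the single allele retained in cells having LOH.
    """
    first, second = normal_genotype
    if first == second:
        # Homozygous in normal cells: the site cannot distinguish LOH cells.
        return None
    if loh_allele not in (first, second):
        # The retained allele does not match the normal genotype call.
        return None
    # Heterozygous in normal cells, hemizygous in LOH cells: target the
    # allele that is the only copy left in the LOH cells.
    return loh_allele


# Illustrative SNP: normal cells carry C and T, LOH cells retain only T.
target = pick_target_allele(("C", "T"), "T")
```

In this sketch a normal cell keeps an unsilenced copy (here "C") of the essential gene, whereas the LOH cell has no second copy to fall back on.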

The invention also includes an iRNA agent disclosed herein, e.g., an iRNA agent which 
can preferentially silence, e.g., cleave, one allele of a polymorphic gene. 

In another aspect, the invention provides a method of cleaving or silencing more than 
one gene with an iRNA agent. In these embodiments the iRNA agent is selected so that it 
has sufficient homology to a sequence found in more than one gene. For example, the 
sequence AAGCTGGCCCTGGACATGGAGAT (SEQ ID NO:6736) is conserved between 
mouse lamin B1, lamin B2, keratin complex 2-gene 1 and lamin A/C. Thus an iRNA agent 
targeted to this sequence would effectively silence the entire collection of genes. 
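
One way to look for such a shared target site is to intersect the fixed-length subsequences of each candidate transcript. The sketch below is illustrative only. The function `shared_kmers` is a hypothetical helper, and the three toy "transcripts" are made-up flanking bases placed around SEQ ID NO:6736; they are not the real lamin sequences.

```python
def shared_kmers(sequences, k):
    """Return the set of length-k subsequences present in every sequence."""
    def kmers(seq):
        seq = seq.upper()
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    common = None
    for seq in sequences.values():
        current = kmers(seq)
        common = current if common is None else common & current
    return common or set()


TARGET = "AAGCTGGCCCTGGACATGGAGAT"  # SEQ ID NO:6736 as given in the text

# Hypothetical toy transcripts: arbitrary flanks around the conserved site.
toy_transcripts = {
    "lamin_B1": "GGATCC" + TARGET + "TTGACA",
    "lamin_B2": "CCTAGG" + TARGET + "ACGTAC",
    "lamin_A/C": "TTTAAA" + TARGET + "GGCCTA",
}

candidates = shared_kmers(toy_transcripts, len(TARGET))
```

With real transcript sequences the same intersection would list every site of the chosen length that is common to all of the genes to be silenced.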

The invention also includes an iRNA agent disclosed herein, which can silence more 
than one gene. 

ROUTE OF DELIVERY 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. A composition that 
includes an iRNA can be delivered to a subject by a variety of routes. Exemplary routes 
include: intravenous, topical, rectal, anal, vaginal, nasal, pulmonary, ocular. 

The iRNA molecules of the invention can be incorporated into pharmaceutical 
compositions suitable for administration. Such compositions typically include one or more 
species of iRNA and a pharmaceutically acceptable carrier. As used herein the language 
"pharmaceutically acceptable carrier" is intended to include any and all solvents, dispersion 
media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, 
and the like, compatible with pharmaceutical administration. The use of such media and 
agents for pharmaceutically active substances is well known in the art. Except insofar as any 
conventional media or agent is incompatible with the active compound, use thereof in the 
compositions is contemplated. Supplementary active compounds can also be incorporated 
into the compositions. 

The pharmaceutical compositions of the present invention may be administered in a 
number of ways depending upon whether local or systemic treatment is desired and upon the 
area to be treated. Administration may be topical (including ophthalmic, vaginal, rectal, 
intranasal, transdermal), oral or parenteral. Parenteral administration includes intravenous 
drip, subcutaneous, intraperitoneal or intramuscular injection, or intrathecal or 
intraventricular administration. 

The route and site of administration may be chosen to enhance targeting. For 
example, to target muscle cells, intramuscular injection into the muscles of interest would be 
a logical choice. Lung cells might be targeted by administering the iRNA in aerosol form. 
The vascular endothelial cells could be targeted by coating a balloon catheter with the iRNA 
and mechanically introducing the DNA. 

Formulations for topical administration may include transdermal patches, ointments, 
lotions, creams, gels, drops, suppositories, sprays, liquids and powders. Conventional 
pharmaceutical carriers, aqueous, powder or oily bases, thickeners and the like may be 
necessary or desirable. Coated condoms, gloves and the like may also be useful. 

Compositions for oral administration include powders or granules, suspensions or 
solutions in water, syrups, elixirs or non-aqueous media, tablets, capsules, lozenges, or 
troches. In the case of tablets, carriers that can be used include lactose, sodium citrate and 
salts of phosphoric acid. Various disintegrants such as starch, and lubricating agents such as 
magnesium stearate, sodium lauryl sulfate and talc, are commonly used in tablets. For oral 
administration in capsule form, useful diluents are lactose and high molecular weight 
polyethylene glycols. When aqueous suspensions are required for oral use, the nucleic acid 
compositions can be combined with emulsifying and suspending agents. If desired, certain 
sweetening and/or flavoring agents can be added. 

Compositions for intrathecal or intraventricular administration may include sterile 
aqueous solutions which may also contain buffers, diluents and other suitable additives. 

Formulations for parenteral administration may include sterile aqueous solutions 
which may also contain buffers, diluents and other suitable additives. Intraventricular 
injection may be facilitated by an intraventricular catheter, for example, attached to a 
reservoir. For intravenous use, the total concentration of solutes should be controlled to 
render the preparation isotonic. 

For ocular administration, ointments or droppable liquids may be delivered by ocular 
delivery systems known to the art such as applicators or eye droppers. Such compositions can 
include mucomimetics such as hyaluronic acid, chondroitin sulfate, hydroxypropyl 
methylcellulose or poly(vinyl alcohol), preservatives such as sorbic acid, EDTA or 
benzylchronium chloride, and the usual quantities of diluents and/or carriers. 

Topical Delivery 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. In a preferred 
embodiment, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) is delivered to a subject via topical administration. "Topical 
administration" refers to the delivery to a subject by contacting the formulation directly to a 
surface of the subject. The most common form of topical delivery is to the skin, but a 
composition disclosed herein can also be directly applied to other surfaces of the body, e.g., 
to the eye, a mucous membrane, to surfaces of a body cavity or to an internal surface. 
The term encompasses 
several routes of administration including, but not limited to, topical and transdermal. These 
modes of administration typically include penetration of the skin's permeability barrier and 
efficient delivery to the target tissue or stratum. Topical administration can be used as a 
means to penetrate the epidermis and dermis and ultimately achieve systemic delivery of the 
composition. Topical administration can also be used as a means to selectively deliver 
oligonucleotides to the epidermis or dermis of a subject, or to specific strata thereof, or to an 
underlying tissue. 

The term "skin," as used herein, refers to the epidermis and/or dermis of an animal. 
Mammalian skin consists of two major, distinct layers. The outer layer of the skin is called 
the epidermis. The epidermis is comprised of the stratum corneum, the stratum granulosum, 
the stratum spinosum, and the stratum basale, with the stratum corneum being at the surface 
of the skin and the stratum basale being the deepest portion of the epidermis. The epidermis 
is between 50 μm and 0.2 mm thick, depending on its location on the body. 

Beneath the epidermis is the dermis, which is significantly thicker than the epidermis. 
The dermis is primarily composed of collagen in the form of fibrous bundles. The 
collagenous bundles provide support for, inter alia, blood vessels, lymph capillaries, glands, 
nerve endings and immunologically active cells. 

One of the major functions of the skin as an organ is to regulate the entry of 
substances into the body. The principal permeability barrier of the skin is provided by the 
stratum corneum, which is formed from many layers of cells in various states of 
differentiation. The spaces between cells in the stratum corneum are filled with different 
lipids arranged in lattice-like formations that provide seals to further enhance the skin's 
permeability barrier. 

The permeability barrier provided by the skin is such that it is largely impermeable to 
molecules having molecular weight greater than about 750 Da. For larger molecules to cross 
the skin's permeability barrier, mechanisms other than normal osmosis must be used. 
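
To put the 750 Da figure in context, the mass of a short RNA duplex can be estimated with a back-of-the-envelope calculation. The sketch below is an approximation under stated assumptions: the average mass of about 330 Da per nucleotide is a common rule of thumb, and the 21-nucleotide strand length is chosen only for illustration; neither value is taken from this section.

```python
AVG_NUCLEOTIDE_DA = 330.0   # assumed average mass per RNA nucleotide (rule of thumb)
SKIN_CUTOFF_DA = 750.0      # approximate cutoff stated in the text


def approx_duplex_mass(strand_length_nt):
    """Rough mass in Da of a duplex made of two strands of equal length."""
    return 2 * strand_length_nt * AVG_NUCLEOTIDE_DA


# Illustrative 21-nucleotide duplex (the length is an assumption).
mass = approx_duplex_mass(21)
fold_over_cutoff = mass / SKIN_CUTOFF_DA
```

Under these assumptions the duplex exceeds the cutoff by more than an order of magnitude, which is consistent with the need for the penetration-enhancing approaches discussed in this section.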

Several factors determine the permeability of the skin to administered agents. These 
factors include the characteristics of the treated skin, the characteristics of the delivery agent, 
interactions between both the drug and delivery agent and the drug and skin, the dosage of 
the drug applied, the form of treatment, and the post treatment regimen. To selectively target 
the epidermis and dermis, it is sometimes possible to formulate a composition that comprises 
one or more penetration enhancers that will enable penetration of the drug to a preselected 
stratum. 

Transdermal delivery is a valuable route for the administration of lipid soluble 
therapeutics. The dermis is more permeable than the epidermis and therefore absorption is 
much more rapid through abraded, burned or denuded skin. Inflammation and other 
physiologic conditions that increase blood flow to the skin also enhance transdermal 
adsorption. Absorption via this route may be enhanced by the use of an oily vehicle 
(inunction) or through the use of one or more penetration enhancers. Other effective ways to 
deliver a composition disclosed herein via the transdermal route include hydration of the skin 
and the use of controlled release topical patches. The transdermal route provides a 
potentially effective means to deliver a composition disclosed herein for systemic and/or 
local therapy. 

In addition, iontophoresis (transfer of ionic solutes through biological membranes 
under the influence of an electric field) (Lee et al., Critical Reviews in Therapeutic Drug 
Carrier Systems, 1991, p. 163), phonophoresis or sonophoresis (use of ultrasound to enhance 
the absorption of various therapeutic agents across biological membranes, notably the skin 
and the cornea) (Lee et al., Critical Reviews in Therapeutic Drug Carrier Systems, 1991, p. 
166), and optimization of vehicle characteristics relative to dose position and retention at the 
site of administration (Lee et al., Critical Reviews in Therapeutic Drug Carrier Systems, 
1991, p. 168) may be useful methods for enhancing the transport of topically applied 
compositions across skin and mucosal sites. 

The compositions and methods provided may also be used to examine the function of 
various proteins and genes in vitro in cultured or preserved dermal tissues and in animals. 
The invention can be thus applied to examine the function of any gene. The methods of the 
invention can also be used therapeutically or prophylactically, for example, for the 
treatment of animals that are known or suspected to suffer from diseases such as psoriasis, 
lichen planus, toxic epidermal necrolysis, erythema multiforme, basal cell carcinoma, 
squamous cell carcinoma, malignant melanoma, Paget's disease, Kaposi's sarcoma, 
pulmonary fibrosis, Lyme disease and viral, fungal and bacterial infections of the skin. 

Pulmonary Delivery 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. A composition that 
includes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) can be administered to a subject by pulmonary delivery. Pulmonary 
delivery compositions can be delivered by inhalation by the patient of a dispersion so that the 
composition, preferably iRNA, within the dispersion can reach the lung where it can be 
readily absorbed through the alveolar region directly into blood circulation. Pulmonary 
delivery can be effective both for systemic delivery and for localized delivery to treat 
diseases of the lungs. 

Pulmonary delivery can be achieved by different approaches, including the use of 
nebulized, aerosolized, micellular and dry powder-based formulations. Delivery can be 
achieved with liquid nebulizers, aerosol-based inhalers, and dry powder dispersion devices. 
Metered-dose devices are preferred. One of the benefits of using an atomizer or inhaler is 
that the potential for contamination is minimized because the devices are self contained. Dry 
powder dispersion devices, for example, deliver drugs that may be readily formulated as dry 
powders. An iRNA composition may be stably stored as lyophilized or spray-dried powders 
by itself or in combination with suitable powder carriers. The delivery of a composition for 
inhalation can be mediated by a dosing timing element which can include a timer, a dose 
counter, time measuring device, or a time indicator which when incorporated into the device 
enables dose tracking, compliance monitoring, and/or dose triggering to a patient during 
administration of the aerosol medicament. 

The term "powder" means a composition that consists of finely dispersed solid
particles that are free flowing and capable of being readily dispersed in an inhalation device
and subsequently inhaled by a subject so that the particles reach the lungs to permit
penetration into the alveoli. Thus, the powder is said to be "respirable." Preferably the
average particle size is less than about 10 µm in diameter, preferably with a relatively uniform
spheroidal shape distribution. More preferably the diameter is less than about 7.5 µm and
most preferably less than about 5.0 µm. Usually the particle size distribution is between
about 0.1 µm and about 5 µm in diameter, particularly about 0.3 to about 5 µm.

The term "dry" means that the composition has a moisture content below about 10% 
by weight (% w) water, usually below about 5% w and preferably less than about 3% w. A
dry composition can be such that the particles are readily dispersible in an inhalation device 
to form an aerosol. 
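
The size and moisture limits in the definitions of "powder" and "dry" above can be read as nested preference tiers. The following is a minimal sketch in Python of that reading. The function names and tier labels are illustrative assumptions, not terms used in this description, and the "about" qualifiers in the text are simplified here to exact cut-offs.

```python
# Illustrative only: function names and tier labels are assumptions,
# and "about" in the text is simplified to exact cut-offs.

def particle_size_tier(diameter_um: float) -> str:
    """Classify an average particle diameter given in micrometers."""
    if diameter_um < 5.0:
        return "most preferred"   # less than about 5.0 um
    if diameter_um < 7.5:
        return "more preferred"   # less than about 7.5 um
    if diameter_um < 10.0:
        return "preferred"        # less than about 10 um
    return "outside stated range"


def moisture_tier(water_pct_w: float) -> str:
    """Classify moisture content given as % by weight water."""
    if water_pct_w < 3.0:
        return "preferred"        # less than about 3% w
    if water_pct_w < 5.0:
        return "usual"            # below about 5% w
    if water_pct_w < 10.0:
        return "dry"              # below about 10% w
    return "not dry"
```

Checking the tightest limit first means each value is reported under the most restrictive tier it satisfies.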

The term "therapeutically effective amount" is the amount present in the composition
that is needed to provide the desired level of drug in the subject to be treated to give the
anticipated physiological response.

The term "physiologically effective amount" is that amount delivered to a subject to
give the desired palliative or curative effect.

The term "pharmaceutically acceptable carrier" means that the carrier can be taken 
into the lungs with no significant adverse toxicological effects on the lungs. 

The types of pharmaceutical excipients that are useful as carriers include stabilizers
such as human serum albumin (HSA), bulking agents such as carbohydrates, amino acids and
polypeptides; pH adjusters or buffers; salts such as sodium chloride; and the like. These
carriers may be in a crystalline or amorphous form or may be a mixture of the two. 

Bulking agents that are particularly valuable include compatible carbohydrates,
polypeptides, amino acids or combinations thereof. Suitable carbohydrates include
monosaccharides such as galactose, D-mannose, sorbose, and the like; disaccharides, such as
lactose, trehalose, and the like; cyclodextrins, such as 2-hydroxypropyl-β-cyclodextrin;
polysaccharides, such as raffinose, maltodextrins, dextrans, and the like; and alditols, such as
mannitol, xylitol, and the like. A preferred group of carbohydrates includes lactose,
trehalose, raffinose, maltodextrins, and mannitol. Suitable polypeptides include aspartame.
Amino acids include alanine and glycine, with glycine being preferred.

Additives, which are minor components of the composition of this invention, may be 
included for conformational stability during spray drying and for improving dispersibility of
the powder. These additives include hydrophobic amino acids such as tryptophan, tyrosine, 
leucine, phenylalanine, and the like. 

Suitable pH adjusters or buffers include organic salts prepared from organic acids and 
bases, such as sodium citrate, sodium ascorbate, and the like; sodium citrate is preferred. 

Pulmonary administration of a micellar iRNA formulation may be achieved through 
metered dose spray devices with propellants such as tetrafluoroethane, heptafluoroethane, 
dimethylfluoropropane, tetrafluoropropane, butane, isobutane, dimethyl ether and other non- 
CFC and CFC propellants. 

Oral or Nasal Delivery 

For ease of exposition the formulations, compositions and methods in this section are
discussed largely with regard to unmodified iRNA agents. It should be understood, however,
that these formulations, compositions and methods can be practiced with other iRNA agents,
e.g., modified iRNA agents, and such practice is within the invention. Both the oral and
nasal membranes offer advantages over other routes of administration. For example, drugs
administered through these membranes have a rapid onset of action, provide therapeutic
plasma levels, avoid first pass effect of hepatic metabolism, and avoid exposure of the drug
to the hostile gastrointestinal (GI) environment. Additional advantages include easy access
to the membrane sites so that the drug can be applied, localized and removed easily.

In oral delivery, compositions can be targeted to a surface of the oral cavity, e.g., to
sublingual mucosa, which includes the membrane of the ventral surface of the tongue and the
floor of the mouth, or the buccal mucosa, which constitutes the lining of the cheek. The
sublingual mucosa is relatively permeable, thus giving rapid absorption and acceptable
bioavailability of many drugs. Further, the sublingual mucosa is convenient, acceptable and
easily accessible. 

The ability of molecules to permeate through the oral mucosa appears to be related to 
molecular size, lipid solubility and peptide protein ionization. Small molecules, less than
1000 daltons, appear to cross mucosa rapidly. As molecular size increases, the permeability
decreases rapidly. Lipid soluble compounds are more permeable than non-lipid soluble
molecules. Maximum absorption occurs when molecules are un-ionized or neutral in
electrical charges. Therefore charged molecules present the biggest challenges to absorption
through the oral mucosae.

A pharmaceutical composition of iRNA may also be administered to the buccal cavity 
of a human being by spraying into the cavity, without inhalation, from a metered dose spray
dispenser, a mixed micellar pharmaceutical formulation as described above and a propellant. 
In one embodiment, the dispenser is first shaken prior to spraying the pharmaceutical 
formulation and propellant into the buccal cavity. 

Devices 

For ease of exposition the devices, formulations, compositions and methods in this
section are discussed largely with regard to unmodified iRNA agents. It should be
understood, however, that these devices, formulations, compositions and methods can be
practiced with other iRNA agents, e.g., modified iRNA agents, and such practice is within the
invention. An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) can be disposed on or in a device, e.g., a device which is implanted or
otherwise placed in a subject. Exemplary devices include devices which are introduced into
the vasculature, e.g., devices inserted into the lumen of a vascular tissue, or which devices
themselves form a part of the vasculature, including stents, catheters, heart valves, and other
vascular devices. These devices, e.g., catheters or stents, can be placed in the vasculature of
the lung, heart, or leg.

Other devices include non-vascular devices, e.g., devices implanted in the
peritoneum, or in organ or glandular tissue, e.g., artificial organs. The device can release a
therapeutic substance in addition to an iRNA, e.g., a device can release insulin.

Other devices include artificial joints, e.g., hip joints, and other orthopedic implants.

In one embodiment, unit doses or measured doses of a composition that includes
iRNA are dispensed by an implanted device. The device can include a sensor that monitors a
parameter within a subject. For example, the device can include a pump and, optionally,
associated electronics.

Tissue, e.g., cells or organs, can be treated with an iRNA agent, e.g., a double-stranded
iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded
iRNA agent, or sRNA agent, or precursor thereof) ex vivo and then administered or
implanted in a subject.

The tissue can be autologous, allogeneic, or xenogeneic tissue. E.g., tissue can be
treated to reduce graft v. host disease. In other embodiments, the tissue is allogeneic and the
tissue is treated to treat a disorder characterized by unwanted gene expression in that tissue.
E.g., tissue, e.g., hematopoietic cells, e.g., bone marrow hematopoietic cells, can be treated to
inhibit unwanted cell proliferation.

Introduction of treated tissue, whether autologous or transplant, can be combined with
other therapies.

In some implementations, the iRNA treated cells are insulated from other cells, e.g., 
by a semi-permeable porous barrier that prevents the cells from leaving the implant, but 
enables molecules from the body to reach the cells and molecules produced by the cells to 
enter the body. In one embodiment, the porous barrier is formed from alginate.

In one embodiment, a contraceptive device is coated with or contains an iRNA agent,
e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA
agent which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent,
e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof). Exemplary
devices include condoms, diaphragms, IUDs (implantable uterine devices), sponges, vaginal
sheaths, and birth control devices. In one embodiment, the iRNA is chosen to inactivate sperm
or egg. In another embodiment, the iRNA is chosen to be complementary to a viral or
pathogen RNA, e.g., an RNA of an STD. In some instances, the iRNA composition can
include a spermicide.

DOSAGE 

In one aspect, the invention features a method of administering an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, to a subject (e.g., a human subject). The
method includes administering a unit dose of the iRNA agent, e.g., a sRNA agent, e.g., a
double stranded sRNA agent that (a) has a double-stranded part 19-25 nucleotides (nt) long,
preferably 21-23 nt, (b) is complementary to a target RNA (e.g., an endogenous or pathogen
target RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long. In
one embodiment, the unit dose is less than 1.4 mg per kg of body weight, or less than 10, 5, 2,
1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, 0.00005 or 0.00001 mg per kg of
bodyweight, and less than 200 nmole of RNA agent (e.g. about 4.4 x 10^16 copies) per kg of
bodyweight, or less than 1500, 750, 300, 150, 75, 15, 7.5, 1.5, 0.75, 0.15, 0.075, 0.015,
0.0075, 0.0015, 0.00075, 0.00015 nmole of RNA agent per kg of bodyweight.
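
The unit dose above is stated both by mass (mg per kg) and by amount (nmole per kg). As a rough sketch of how the two relate, the Python snippet below converts a mass-based dose to a molar one. The molar mass is an assumption: the description gives none, and the value of 7,000 g/mol is used purely for illustration because it makes 1.4 mg correspond to about 200 nmole. The function name is hypothetical.

```python
def mg_per_kg_to_nmol_per_kg(dose_mg_per_kg: float,
                             molar_mass_g_per_mol: float) -> float:
    """Convert a mass-based dose to a molar dose.

    molar_mass_g_per_mol is an assumed input; the description
    does not state a molar mass for the agent.
    """
    grams = dose_mg_per_kg * 1e-3          # mg -> g
    moles = grams / molar_mass_g_per_mol   # g -> mol
    return moles * 1e9                     # mol -> nmol


ASSUMED_MOLAR_MASS = 7000.0  # g/mol, illustrative assumption only

dose_nmol = mg_per_kg_to_nmol_per_kg(1.4, ASSUMED_MOLAR_MASS)
print(round(dose_nmol, 3))
```

With a different assumed molar mass the molar figure scales inversely, so the two limits in the text need not coincide for every agent.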

The defined amount can be an amount effective to treat or prevent a disease or 
disorder, e.g., a disease or disorder associated with the target RNA. The unit dose, for 
example, can be administered by injection (e.g., intravenous or intramuscular), an inhaled 
dose, or a topical application. Particularly preferred dosages are less than 2, 1, or 0.1 mg/kg 
of body weight. 

In a preferred embodiment, the unit dose is administered less frequently than once a 
day, e.g., less than every 2, 4, 8 or 30 days. In another embodiment, the unit dose is not
administered with a frequency (e.g., not a regular frequency). For example, the unit dose
may be administered a single time.

In one embodiment, the effective dose is administered with other traditional
therapeutic modalities. In one embodiment, the subject has a viral infection and the modality
is an antiviral agent other than an iRNA agent, e.g., other than a double-stranded iRNA
agent, or sRNA agent. In another embodiment, the subject has atherosclerosis and the
effective dose of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, is
administered in combination with, e.g., after surgical intervention, e.g., angioplasty.

In one embodiment, a subject is administered an initial dose and one or more 
maintenance doses of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof). The maintenance dose or doses are generally lower than the initial dose, 
e.g., one-half less of the initial dose. A maintenance regimen can include treating the subject
with a dose or doses ranging from 0.01 µg to 1.4 mg/kg of body weight per day, e.g., 10, 1,
0.1, 0.01, 0.001, or 0.00001 mg per kg of bodyweight per day. The maintenance doses are 
preferably administered no more than once every 5, 10, or 30 days. Further, the treatment 
regimen may last for a period of time which will vary depending upon the nature of the 
particular disease, its severity and the overall condition of the patient. In preferred 
embodiments the dosage may be delivered no more than once per day, e.g., no more than 
once per 24, 36, 48, or more hours, e.g., no more than once for every 5 or 8 days. Following 
treatment, the patient can be monitored for changes in his condition and for alleviation of the 
symptoms of the disease state. The dosage of the compound may either be increased in the 
event the patient does not respond significantly to current dosage levels, or the dose may be 
decreased if an alleviation of the symptoms of the disease state is observed, if the disease 
state has been ablated, or if undesired side-effects are observed. 
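
The monitoring-and-adjustment logic at the end of the preceding paragraph can be sketched as a simple decision rule. This is an illustration only, not a clinical algorithm: the function name, the boolean inputs, and the 50% step size are assumptions introduced here (the text says the dose may be increased or decreased, but not by how much), as is the choice to let a reason for decreasing take priority over a reason for increasing.

```python
def adjust_dose(current_dose: float,
                responded: bool,
                symptoms_alleviated: bool = False,
                disease_ablated: bool = False,
                side_effects: bool = False,
                step: float = 0.5) -> float:
    """Return an adjusted dose following the rule described in the text.

    'step' (fractional change) is a hypothetical parameter.
    """
    # Decrease on alleviation, ablation, or undesired side effects.
    # Giving this branch priority is an assumption of this sketch.
    if symptoms_alleviated or disease_ablated or side_effects:
        return current_dose * (1.0 - step)
    # Increase if the patient does not respond significantly.
    if not responded:
        return current_dose * (1.0 + step)
    # Otherwise leave the dose unchanged.
    return current_dose
```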

The effective dose can be administered in a single dose or in two or more doses, as 
desired or considered appropriate under the specific circumstances. If desired to facilitate 
repeated or frequent infusions, implantation of a delivery device, e.g., a pump, semi-permanent
stent (e.g., intravenous, intraperitoneal, intracisternal or intracapsular), or
reservoir may be advisable. 

In one embodiment, the iRNA agent pharmaceutical composition includes a plurality 
of iRNA agent species. In another embodiment, the iRNA agent species has sequences that
are non-overlapping and non-adjacent to another species with respect to a naturally occurring
target sequence. In another embodiment, the plurality of iRNA agent species is specific for
different naturally occurring target genes. In another embodiment, the iRNA agent is allele
specific.

In some cases, a patient is treated with an iRNA agent in conjunction with other
therapeutic modalities. For example, a patient being treated for a viral disease, e.g., an HIV
associated disease (e.g., AIDS), may be administered an iRNA agent specific for a target gene
essential to the virus in conjunction with a known antiviral agent (e.g., a protease inhibitor or
reverse transcriptase inhibitor). In another example, a patient being treated for cancer may be
administered an iRNA agent specific for a target essential for tumor cell proliferation in
conjunction with a chemotherapy.

Following successful treatment, it may be desirable to have the patient undergo
maintenance therapy to prevent the recurrence of the disease state, wherein the compound of
the invention is administered in maintenance doses, ranging from 0.01 µg to 100 g per kg of
body weight (see US 6,107,094).

The concentration of the iRNA agent composition is an amount sufficient to be 
effective in treating or preventing a disorder or to regulate a physiological condition in 
humans. The concentration or amount of iRNA agent administered will depend on the 
parameters determined for the agent and the method of administration, e.g. nasal, buccal,
pulmonary. For example, nasal formulations tend to require much lower concentrations of
some ingredients in order to avoid irritation or burning of the nasal passages. It is sometimes
desirable to dilute an oral formulation up to 10-100 times in order to provide a suitable nasal 
formulation. 
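
As a worked illustration of the 10-100 fold dilution mentioned above, the sketch below computes the concentration range that results from diluting an oral formulation. The function name and the example concentration are hypothetical; only the 10 and 100 fold factors come from the text.

```python
def nasal_concentration_range(oral_conc: float,
                              min_fold: float = 10.0,
                              max_fold: float = 100.0) -> tuple:
    """Return (lowest, highest) concentration after a min_fold..max_fold dilution."""
    if min_fold <= 0 or max_fold < min_fold:
        raise ValueError("dilution factors must satisfy 0 < min_fold <= max_fold")
    # The largest dilution gives the lowest concentration, and vice versa.
    return (oral_conc / max_fold, oral_conc / min_fold)


# Hypothetical oral concentration of 50, in arbitrary units.
low, high = nasal_concentration_range(50.0)
print(low, high)
```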

Certain factors may influence the dosage required to effectively treat a subject, 
including but not limited to the severity of the disease or disorder, previous treatments, the
general health and/or age of the subject, and other diseases present. Moreover, treatment of a
subject with a therapeutically effective amount of an iRNA agent, e.g., a double-stranded
iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be
processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded
iRNA agent, or sRNA agent, or precursor thereof) can include a single treatment
or, preferably, can include a series of treatments. It will also be appreciated that the effective
dosage of an iRNA agent such as a sRNA agent used for treatment may increase or decrease
over the course of a particular treatment. Changes in dosage may result and become apparent
from the results of diagnostic assays as described herein. For example, the subject can be
monitored after administering an iRNA agent composition. Based on information from the
monitoring, an additional amount of the iRNA agent composition can be administered.

Dosing is dependent on severity and responsiveness of the disease condition to be 
treated, with the course of treatment lasting from several days to several months, or until a
cure is effected or a diminution of disease state is achieved. Optimal dosing schedules can be 
calculated from measurements of drug accumulation in the body of the patient. Persons of 
ordinary skill can easily determine optimum dosages, dosing methodologies and repetition 
rates. Optimum dosages may vary depending on the relative potency of individual 
compounds, and can generally be estimated based on EC50s found to be effective in in vitro 
and in vivo animal models. In some embodiments, the animal models include transgenic 
animals that express a human gene, e.g., a gene that produces a target RNA. The transgenic
animal can be deficient for the corresponding endogenous RNA. In another embodiment, the
composition for testing includes an iRNA agent that is complementary, at least in an internal
region, to a sequence that is conserved between the target RNA in the animal model and the
target RNA in a human.

The inventors have discovered that iRNA agents described herein can be administered 
to mammals, particularly large mammals such as nonhuman primates or humans, in a number
of ways. 

In one embodiment, the administration of the iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, composition is parenteral, e.g. intravenous (e.g., as a bolus or as 
a diffusible infusion), intradermal, intraperitoneal, intramuscular, intrathecal, intraventricular,
intracranial, subcutaneous, transmucosal, buccal, sublingual, endoscopic, rectal, oral, vaginal, 
topical, pulmonary, intranasal, urethral or ocular. Administration can be provided by the 
subject or by another person, e.g., a health care provider. The medication can be provided in 
measured doses or in a dispenser which delivers a metered dose. Selected modes of delivery 
are discussed in more detail below.

The invention provides methods, compositions, and kits, for rectal administration or
delivery of iRNA agents described herein.

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
or precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA
agent described herein, e.g., an iRNA agent having a double stranded region of less than 40,
and preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3'
overhangs can be administered rectally, e.g., introduced through the rectum into the lower or
upper colon. This approach is particularly useful in the treatment of inflammatory disorders,
disorders characterized by unwanted cell proliferation, e.g., polyps, or colon cancer.

The medication can be delivered to a site in the colon by introducing a dispensing 
device, e.g., a flexible, camera-guided device similar to that used for inspection of the colon 
or removal of polyps, which includes means for delivery of the medication. 

The rectal administration of the iRNA agent can be by means of an enema. The iRNA
agent of the enema can be dissolved in a saline or buffered solution. The rectal
administration can also be by means of a suppository, which can include other ingredients, e.g.,
an excipient, e.g., cocoa butter or hydroxypropylmethylcellulose.

Any of the iRNA agents described herein can be administered orally, e.g., in the form
of tablets, capsules, gel capsules, lozenges, troches or liquid syrups. Further, the composition
can be applied topically to a surface of the oral cavity. 

Any of the iRNA agents described herein can be administered buccally. For example, 
the medication can be sprayed into the buccal cavity or applied directly, e.g., in a liquid, 
solid, or gel form to a surface in the buccal cavity. This administration is particularly
desirable for the treatment of inflammations of the buccal cavity, e.g., the gums or tongue.
In one embodiment, the buccal administration is by spraying into the cavity, e.g.,
without inhalation, from a dispenser, e.g., a metered dose spray dispenser that dispenses the 
pharmaceutical composition and a propellant. 

Any of the iRNA agents described herein can be administered to ocular tissue. For
example, the medications can be applied to the surface of the eye or nearby tissue, e.g., the
inside of the eyelid. They can be applied topically, e.g., by spraying, in drops, as an
eyewash, or an ointment. Administration can be provided by the subject or by another
person, e.g., a health care provider. The medication can be provided in measured doses or in
a dispenser which delivers a metered dose. The medication can also be administered to the 
interior of the eye, and can be introduced by a needle or other delivery device which can 
introduce it to a selected area or structure. Ocular treatment is particularly desirable for 
treating inflammation of the eye or nearby tissue. 

Any of the iRNA agents described herein can be administered directly to the skin. 
For example, the medication can be applied topically or delivered in a layer of the skin, e.g.,
by the use of a microneedle or a battery of microneedles which penetrate into the skin, but 
preferably not into the underlying muscle tissue. Administration of the iRNA agent 
composition can be topical. Topical applications can, for example, deliver the composition 
to the dermis or epidermis of a subject. Topical administration can be in the form of 
transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids or 
powders. A composition for topical administration can be formulated as a liposome, micelle, 
emulsion, or other lipophilic molecular assembly. The transdermal administration can be 
applied with at least one penetration enhancer, such as iontophoresis, phonophoresis, and 
sonophoresis. 

Any of the iRNA agents described herein can be administered to the pulmonary 
system. Pulmonary administration can be achieved by inhalation or by the introduction of a
delivery device into the pulmonary system, e.g., by introducing a delivery device which can 
dispense the medication. A preferred method of pulmonary delivery is by inhalation. The 
medication can be provided in a dispenser which delivers the medication, e.g., wet or dry, in 
a form sufficiently small such that it can be inhaled. The device can deliver a metered dose 
of medication. The subject, or another person, can administer the medication. 

Pulmonary delivery is effective not only for disorders which directly affect 
pulmonary tissue, but also for disorders which affect other tissue. 

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder, crystal, or 
aerosol for pulmonary delivery. 

Any of the iRNA agents described herein can be administered nasally. Nasal 
administration can be achieved by introduction of a delivery device into the nose, e.g., by 
introducing a delivery device which can dispense the medication. Methods of nasal delivery
include spray, aerosol, liquid, e.g., by drops, or by topical administration to a surface of the
nasal cavity. The medication can be provided in a dispenser with delivery of the medication,
e.g., wet or dry, in a form sufficiently small such that it can be inhaled. The device can
deliver a metered dose of medication. The subject, or another person, can administer the
medication.

Nasal delivery is effective not only for disorders which directly affect nasal tissue, but
also for disorders which affect other tissue.

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder or crystal, for
nasal delivery.

An iRNA agent can be packaged in a viral natural capsid or in a chemically or
enzymatically produced artificial capsid or structure derived therefrom.

The dosage of a pharmaceutical composition including an iRNA agent can be
administered in order to alleviate the symptoms of a disease state, e.g., cancer or a
cardiovascular disease. A subject can be treated with the pharmaceutical composition by any
of the methods mentioned above.

Gene expression in a subject can be modulated by administering a pharmaceutical 
composition including an iRNA agent.

A subject can be treated by administering a defined amount of an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent
which can be processed into a sRNA agent) composition that is in a powdered form, e.g., a
collection of microparticles, such as crystalline particles. The composition can include a
plurality of iRNA agents, e.g., specific for one or more different endogenous target RNAs.
The method can include other features described herein. 

A subject can be treated by administering a defined amount of an iRNA agent
composition that is prepared by a method that includes spray-drying, i.e., atomizing a liquid
solution, emulsion, or suspension, immediately exposing the droplets to a drying gas, and 
collecting the resulting porous powder particles. The composition can include a plurality of 
iRNA agents, e.g., specific for one or more different endogenous target RNAs. The method 
can include other features described herein. 

The iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof), can be provided in a powdered, crystallized or other finely divided form,
with or without a carrier, e.g., a micro- or nano-particle suitable for inhalation or other
pulmonary delivery. This can include providing an aerosol preparation, e.g., an aerosolized
spray-dried composition. The aerosol composition can be provided in and/or dispensed by a
metered dose delivery device. 

The subject can be treated for a condition treatable by inhalation, e.g., by aerosolizing 
a spray-dried iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) composition and inhaling the aerosolized composition. The iRNA agent
can be an sRNA. The composition can include a plurality of iRNA agents, e.g., specific for 
one or more different endogenous target RNAs. The method can include other features 
described herein. 

A subject can be treated by, for example, administering a composition including an
effective/defined amount of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA
agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA
agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or
sRNA agent, or precursor thereof), wherein the composition is prepared by a method that
includes spray-drying, lyophilization, vacuum drying, evaporation, fluid bed drying, or a
combination of these techniques.

In another aspect, the invention features a method that includes: evaluating a 
parameter related to the abundance of a transcript in a cell of a subject; comparing the 
evaluated parameter to a reference value; and if the evaluated parameter has a preselected
relationship to the reference value (e.g., it is greater), administering an iRNA agent (or a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent or precursor thereof) to the subject. In one embodiment, the
iRNA agent includes a sequence that is complementary to the evaluated transcript. For
example, the parameter can be a direct measure of transcript levels, a measure of a protein
level, a disease or disorder symptom or characterization (e.g., rate of cell proliferation and/or
tumor mass, viral load).

In another aspect, the invention features a method that includes: administering a first
amount of a composition that comprises an iRNA agent, e.g., a double-stranded iRNA agent,
or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a
sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent,
or sRNA agent, or precursor thereof) to a subject, wherein the iRNA agent includes a strand
substantially complementary to a target nucleic acid; evaluating an activity associated with a
protein encoded by the target nucleic acid; wherein the evaluation is used to determine if a
second amount should be administered. In a preferred embodiment the method includes
administering a second amount of the composition, wherein the timing of administration or
dosage of the second amount is a function of the evaluating. The method can include other
features described herein. 
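
The two preceding aspects share one pattern: a parameter is evaluated, compared against a reference, and the result decides whether an amount is administered. A minimal Python sketch of that comparison step follows. The function name is hypothetical, and the default relationship ("greater than") is simply the example given in the text; any other preselected relationship can be passed in.

```python
import operator
from typing import Callable


def should_administer(evaluated: float,
                      reference: float,
                      relationship: Callable[[float, float], bool] = operator.gt) -> bool:
    """Return True if the evaluated parameter has the preselected
    relationship to the reference value (default: it is greater)."""
    return relationship(evaluated, reference)
```

For a parameter where a low value is the trigger, `operator.lt` could be supplied as the relationship instead.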

In another aspect, the invention features a method of administering a source of a 
double-stranded iRNA agent (ds iRNA agent) to a subject. The method includes 
administering or implanting a source of a ds iRNA agent, e.g., a sRNA agent, that (a)
includes a double-stranded region that is 19-25 nucleotides long, preferably 21-23
nucleotides, (b) is complementary to a target RNA (e.g., an endogenous RNA or a pathogen
RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nt long. In one embodiment,
the source releases ds iRNA agent over time, e.g. the source is a controlled or a slow release
source, e.g., a microparticle that gradually releases the ds iRNA agent. In another
embodiment, the source is a pump, e.g., a pump that includes a sensor or a pump that can
release one or more unit doses. 
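
Criteria (a) to (c) above can be expressed as a small check on a candidate sequence. The sketch below is illustrative only: the function names and the dictionary keys are hypothetical, sequences are assumed to be written 5' to 3' in the RNA alphabet, and exact complementarity is used as a simplification even though the description elsewhere also allows substantial complementarity.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def check_ds_agent(antisense_duplex_region: str,
                   target_rna: str,
                   overhangs_3p: tuple = ()) -> dict:
    """Check criteria (a)-(c) for a candidate ds iRNA agent.

    antisense_duplex_region: paired part of the antisense strand, 5'->3'.
    overhangs_3p: lengths of any 3' overhangs (zero, one or two entries).
    Exact complementarity is a simplification made for this sketch.
    """
    n = len(antisense_duplex_region)
    site = reverse_complement(antisense_duplex_region)
    return {
        "length_19_25": 19 <= n <= 25,                                   # (a)
        "length_21_23_preferred": 21 <= n <= 23,                         # (a), preferred
        "complementary_to_target": site in target_rna.upper(),           # (b)
        "has_3p_overhang_1_5": any(1 <= k <= 5 for k in overhangs_3p),   # (c), optional
    }
```

Because criterion (c) is optional in the text, the overhang result is reported separately rather than folded into a single pass or fail value.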

In one aspect, the invention features a pharmaceutical composition that includes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
lEirger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 

25 iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) 

including a nucleotide sequence complementEiry to a target RNA, e.g., substantially and/or 
exactly complementary. The target RNA can be a transcript of an endogenous human gene. 
In one embodiment, the iRNA agent (a) is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to an endogenous target RNA, and, optionally, (c) includes
at least one 3' overhang 1-5 nt long. In one embodiment, the pharmaceutical composition can
be an emulsion, microemulsion, cream, jelly, or liposome.

In one example the pharmaceutical composition includes an iRNA agent mixed with a
topical delivery agent. The topical delivery agent can be a plurality of microscopic vesicles. 
The microscopic vesicles can be liposomes. In a preferred embodiment the liposomes are 
cationic liposomes. 

In another aspect, the pharmaceutical composition includes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent (e.g., a precursor, e.g., a larger iRNA agent
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, or precursor thereof) admixed with a topical
penetration enhancer. In one embodiment, the topical penetration enhancer is a fatty acid.
The fatty acid can be arachidonic acid, oleic acid, lauric acid, caprylic acid, capric acid,
myristic acid, palmitic acid, stearic acid, linoleic acid, linolenic acid, dicaprate, tricaprate,
monolein, dilaurin, glyceryl 1-monocaprate, 1-dodecylazacycloheptan-2-one, an
acylcarnitine, an acylcholine, or a C1-10 alkyl ester, monoglyceride, diglyceride or
pharmaceutically acceptable salt thereof.

In another embodiment, the topical penetration enhancer is a bile salt. The bile salt 
can be cholic acid, dehydrocholic acid, deoxycholic acid, glucholic acid, glycholic acid, 
glycodeoxycholic acid, taurocholic acid, taurodeoxycholic acid, chenodeoxycholic acid,
ursodeoxycholic acid, sodium tauro-24,25-dihydro-fusidate, sodium glycodihydrofusidate,
polyoxyethylene-9-lauryl ether or a pharmaceutically acceptable salt thereof.

In another embodiment, the penetration enhancer is a chelating agent. The chelating 
agent can be EDTA, citric acid, a salicylate, an N-acyl derivative of collagen, laureth-9, an
N-amino acyl derivative of a beta-diketone or a mixture thereof.

In another embodiment, the penetration enhancer is a surfactant, e.g., an ionic or 
nonionic surfactant. The surfactant can be sodium lauryl sulfate, polyoxyethylene-9-lauryl
ether, polyoxyethylene-20-cetyl ether, a perfluorochemical emulsion or mixture thereof.

In another embodiment, the penetration enhancer can be selected from a group
consisting of unsaturated cyclic ureas, 1-alkyl-alkones, 1-alkenylazacyclo-alkanones,
steroidal anti-inflammatory agents and mixtures thereof. In yet another embodiment the
penetration enhancer can be a glycol, a pyrrol, an azone, or a terpene.

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
form suitable for oral delivery. In one embodiment, oral delivery can be used to deliver an 
iRNA agent composition to a cell or a region of the gastro-intestinal tract, e.g., small
intestine, colon (e.g., to treat a colon cancer), and so forth. The oral delivery form can be
tablets, capsules or gel capsules. In one embodiment, the iRNA agent of the pharmaceutical
composition modulates expression of a cellular adhesion protein, modulates a rate of cellular
proliferation, or has biological activity against eukaryotic pathogens or retroviruses. In
another embodiment, the pharmaceutical composition includes an enteric material that 
substantially prevents dissolution of the tablets, capsules or gel capsules in a mammalian 
stomach. In a preferred embodiment the enteric material is a coating. The coating can be 
acetate phthalate, propylene glycol, sorbitan monoleate, cellulose acetate trimellitate,
hydroxy propyl methylcellulose phthalate or cellulose acetate phthalate. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a penetration enhancer. The penetration enhancer can be a bile salt or a fatty acid. 
The bile salt can be ursodeoxycholic acid, chenodeoxycholic acid, and salts thereof. The 
fatty acid can be capric acid, lauric acid, and salts thereof. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes an excipient. In one example the excipient is polyethyleneglycol. In another
example the excipient is precirol. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate,
dibutyl phthalate or triethyl citrate. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent and a delivery vehicle. In one embodiment, the iRNA agent (a) is 19-25
nucleotides long, preferably 21-23 nucleotides, (b) is complementary to an endogenous target 
RNA, and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long. 

In one embodiment, the delivery vehicle can deliver an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, or precursor thereof) to a cell by a topical route of
administration. The delivery vehicle can be microscopic vesicles. In one example the
microscopic vesicles are liposomes. In a preferred embodiment the liposomes are cationic
liposomes. In another example the microscopic vesicles are micelles.

In one aspect, the
invention features a pharmaceutical composition including an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, or precursor thereof) in an injectable dosage form. In 
one embodiment, the injectable dosage form of the pharmaceutical composition includes 
sterile aqueous solutions or dispersions and sterile powders. In a preferred embodiment the 
sterile solution can include a diluent such as water; saline solution; fixed oils, polyethylene 
glycols, glycerin, or propylene glycol. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in 
oral dosage form. In one embodiment, the oral dosage form is selected from the group 
consisting of tablets, capsules and gel capsules. In another embodiment, the pharmaceutical 
composition includes an enteric material that substantially prevents dissolution of the tablets, 
capsules or gel capsules in a mammalian stomach. In a preferred embodiment the enteric 
material is a coating. The coating can be acetate phthalate, propylene glycol, sorbitan 
monoleate, cellulose acetate trimellitate, hydroxy propyl methyl cellulose phthalate or 
cellulose acetate phthalate. In one embodiment, the oral dosage form of the pharmaceutical 
composition includes a penetration enhancer, e.g., a penetration enhancer described herein. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes an excipient. In one example the excipient is polyethyleneglycol. In another 
example the excipient is precirol. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate,
dibutyl phthalate or triethyl citrate. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
rectal dosage form. In one embodiment, the rectal dosage form is an enema. In another 
embodiment, the rectal dosage form is a suppository. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
vaginal dosage form. In one embodiment, the vaginal dosage form is a suppository. In 
another embodiment, the vaginal dosage form is a foam, cream, or gel. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
pulmonary or nasal dosage form. In one embodiment, the iRNA agent is incorporated into a 
particle, e.g., a macroparticle, e.g., a microsphere. The particle can be produced by spray 
drying, lyophilization, evaporation, fluid bed drying, vacuum drying, or a combination 
thereof. The microsphere can be formulated as a suspension, a powder, or an implantable
solid. 

In one aspect, the invention features a spray-dried iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, or precursor thereof) composition suitable for
inhalation by a subject, including: (a) a therapeutically effective amount of an iRNA agent
suitable for treating a condition in the subject by inhalation; (b) a pharmaceutically 
acceptable excipient selected from the group consisting of carbohydrates and amino acids; 
and (c) optionally, a dispersibility-enhancing amount of a physiologically-acceptable, water- 
soluble polypeptide. 

In one embodiment, the excipient is a carbohydrate. The carbohydrate can be
selected from the group consisting of monosaccharides, disaccharides, trisaccharides, and
polysaccharides. In a preferred embodiment the carbohydrate is a monosaccharide selected
from the group consisting of dextrose, galactose, mannitol, D-mannose, sorbitol, and sorbose.
In another preferred embodiment the carbohydrate is a disaccharide selected from the group 
consisting of lactose, maltose, sucrose, and trehalose. 

In another embodiment, the excipient is an amino acid. In one embodiment, the 
amino acid is a hydrophobic amino acid. In a preferred embodiment the hydrophobic amino
acid is selected from the group consisting of alanine, isoleucine, leucine, methionine,
phenylalanine, proline, tryptophan, and valine. In yet another embodiment the amino acid is a
polar amino acid. In a preferred embodiment the amino acid is selected from the group
consisting of arginine, histidine, lysine, cysteine, glycine, glutamine, serine, threonine,
tyrosine, aspartic acid and glutamic acid.

In one embodiment, the dispersibility-enhancing polypeptide is selected from the 
group consisting of human serum albumin, α-lactalbumin, trypsinogen, and polyalanine.

In one embodiment, the spray-dried iRNA agent composition includes particles 
having a mass median diameter (MMD) of less than 10 microns. In another embodiment, 
the spray-dried iRNA agent composition includes particles having a mass median diameter of
less than 5 microns. In yet another embodiment the spray-dried iRNA agent composition 
includes particles having a mass median aerodynamic diameter (MMAD) of less than 5 
microns. 
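
As a rough illustration of how a mass median diameter might be checked against the thresholds above, the following Python sketch assumes spherical particles of uniform density, so that each particle's mass scales with the cube of its diameter. The function name and the sample diameters are hypothetical and are not taken from this disclosure.

```python
def mass_median_diameter(diameters_um):
    """Return the diameter (in microns) at which half of the total particle
    mass is carried by particles of that size or smaller.

    Assumes spherical particles of equal density, so mass ~ diameter**3.
    """
    if not diameters_um:
        raise ValueError("at least one diameter is required")
    ordered = sorted(diameters_um)
    masses = [d ** 3 for d in ordered]
    half_total = sum(masses) / 2.0
    running = 0.0
    for d, m in zip(ordered, masses):
        running += m
        if running >= half_total:
            return d
    return ordered[-1]


# Hypothetical sample of particle diameters, in microns.
sample = [1.0, 2.0, 2.0, 3.0, 4.0]
mmd = mass_median_diameter(sample)
print(mmd, mmd < 5.0, mmd < 10.0)
```

The mass median aerodynamic diameter (MMAD) additionally depends on particle density and shape, which this sketch does not model.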

In certain other aspects, the invention provides kits that include a suitable container
containing a pharmaceutical formulation of an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, or precursor thereof). In certain embodiments the individual
components of the pharmaceutical formulation may be provided in one container.
Alternatively, it may be desirable to provide the components of the pharmaceutical
formulation separately in two or more containers, e.g., one container for an iRNA agent
preparation, and at least another for a carrier compound. The kit may be packaged in a 
number of different configurations such as one or more containers in a single box. The 
different components can be combined, e.g., according to instructions provided with the kit. 
The components can be combined according to a method described herein, e.g., to prepare
and administer a pharmaceutical composition. The kit can also include a delivery device.

In another aspect, the invention features a device, e.g., an implantable device, wherein
the device can dispense or administer a composition that includes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, or precursor thereof), e.g., an iRNA agent that
silences an endogenous transcript. In one embodiment, the device is coated with the 
composition. In another embodiment the iRNA agent is disposed within the device. In 
another embodiment, the device includes a mechanism to dispense a unit dose of the 
composition. In other embodiments the device releases the composition continuously, e.g., 
by diffusion. Exemplary devices include stents, catheters, pumps, artificial organs or organ 
components (e.g., artificial heart, a heart valve, etc.), and sutures. 

As used herein, the term "crystalline" describes a solid having the structure or
characteristics of a crystal, i.e., particles of three-dimensional structure in which the plane
faces intersect at definite angles and in which there is a regular internal structure. The
compositions of the invention may have different crystalline forms. Crystalline forms can be
prepared by a variety of methods, including, for example, spray drying. 

The invention is further illustrated by the following examples, which should not be 
construed as further limiting. 

EXAMPLES 

Example 1: Inhibition of endogenous ApoM gene expression in mice

Apolipoprotein M (ApoM) is a human apolipoprotein predominantly present in high-
density lipoprotein (HDL) in plasma. ApoM is reported to be expressed exclusively in liver
and in kidney (Xu N et al., Biochem J Biol Chem 1999 Oct 29;274(44):31286-90). Mouse
ApoM is a 21 kD membrane associated protein, and, in serum, the protein is associated with
HDL particles. ApoM gene expression is regulated by the transcription factor hepatocyte
nuclear factor 1 alpha (Hnf-1α), as Hnf-1α-/- mice are ApoM deficient. In humans, mutations
in the HNF-1 alpha gene represent a common cause of maturity-onset diabetes of the young
(MODY).

A variety of test iRNAs were synthesized to target the mouse ApoM gene. This gene
was chosen in part because of its high expression levels and exclusive activity in the liver and 
kidney. 

Three different classes of dsRNA agents were synthesized, each class having different
modifications and features at the 5' and 3' ends; see Table 4.

Table 4

Targeted ORF's

 5   The 23mer: AAGTTTGGGCAGCTCTGCTCT (SEQ ID NO: 6708)
19   The 23mer: AAGTGGACATACCGATTGACT (SEQ ID NO: 6709)
25   The 23mer: AACTCAGAACTGAAGGGCGCC (SEQ ID NO: 6710)
27   The 23mer: AAGGGCGCCCAGACATGAAAA (SEQ ID NO: 6711)

3'-UTR (beginning at 645)

42:  AAGATAGGAGCCCAGCTTCGA (SEQ ID NO: 6712)

Class I

21-nt iRNAs; t, deoxythymidine; p, phosphate

pGUUUGGGCAGCUCUGCUCUtt (SEQ ID NO: 6712)   #1
pAGAGCAGAGCUGCCCAAACtt (SEQ ID NO: 6713)

pGUGGACAUACCGAUUGACUtt (SEQ ID NO: 6714)   #2
pAGUCAAUCGGUAUGUCCACtt (SEQ ID NO: 6715)

pCUCAGAACUGAAGGGCGCCtt (SEQ ID NO: 6716)   #3
pGGCGCCCUUCAGUUCUGAGtt (SEQ ID NO: 6717)

pGAUAGGAGCCCAGCUUCGAtt (SEQ ID NO: 6718)   #4
pUCGAAGCUGGGCUCCUAUCtt (SEQ ID NO: 6719)

Class II

21-nt iRNAs; t, deoxythymidine; p, phosphate; ps, thiophosphate

pGUUUGGGCAGCUCUGCUCpsUpstpst (SEQ ID NO: 6720)   #11
pAGAGCAGAGCUGCCCAAApsCpstpst (SEQ ID NO: 6721)

pGUGGACAUACCGAUUGACpsUpstpst (SEQ ID NO: 6722)   #13
pAGUCAAUCGGUAUGUCCApsCpstpst (SEQ ID NO: 6723)

pCUCAGAACUGAAGGGCGCpsCpstpst (SEQ ID NO: 6724)   #15
pGGCGCCCUUCAGUUCUGApsGpstpst (SEQ ID NO: 6725)

pGAUAGGAGCCCAGCUUCGpsApstpst (SEQ ID NO: 6726)   #17
pUCGAAGCUGGGCUCCUAUpsCpstpst (SEQ ID NO: 6727)

Class III

23-nt antisense, 21-nt sense, blunt-ended 5'-as

GUUUGGGCAGCUCUGCUCUCU (SEQ ID NO: 6728)     #19
AGAGAGCAGAGCUGCCCAAACUU (SEQ ID NO: 6729)

GUGGACAUACCGAUUGACUGA (SEQ ID NO: 6730)     #21
UCAGUCAAUCGGUAUGUCCACUU (SEQ ID NO: 6731)

CUCAGAACUGAAGGGCGCCCA (SEQ ID NO: 6732)     #23
UGGGCGCCCUUCAGUUCUGAGUU (SEQ ID NO: 6733)

GAUAGGAGCCCAGCUUCGAGU (SEQ ID NO: 6734)     #25
ACUCGAAGCUGGGCUCCUAUCUU (SEQ ID NO: 6735)


Class I dsRNAs consisted of 21 nucleotide paired sense and antisense strands. The
sense and antisense strands were each phosphorylated at their 5' ends. The double stranded
region was 19 nucleotides long and consisted of ribonucleotides. The 3' end of each strand
created a two nucleotide overhang consisting of two deoxyribonucleotide thymidines. See
constructs #1-4 in Table 4.
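
The relationship between the targeted sequences and the Class I strands listed in Table 4 can be sketched in Python. This sketch assumes, based only on the table entries, that the 19-nt paired region corresponds to the 21-nt target sequence with its leading AA removed. The function name is hypothetical, and the code is an illustration rather than a statement of how the strands were actually designed.

```python
DNA_TO_RNA = str.maketrans("ACGT", "ACGU")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def class_i_duplex(target_dna):
    """Build Class I-style strands from a 21-nt target of the form AA(N19)."""
    if len(target_dna) != 21 or not target_dna.startswith("AA"):
        raise ValueError("expected a 21-nt target beginning with AA")
    core = target_dna[2:].translate(DNA_TO_RNA)             # 19-nt paired region
    antisense_core = core.translate(RNA_COMPLEMENT)[::-1]   # reverse complement
    # 'p' marks the 5' phosphate; 'tt' marks the two deoxythymidine overhang residues.
    return "p" + core + "tt", "p" + antisense_core + "tt"


sense, antisense = class_i_duplex("AAGTTTGGGCAGCTCTGCTCT")
print(sense)
print(antisense)
```

Applied to the first targeted sequence in Table 4, the sketch reproduces the two strands listed for construct #1.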

Class II dsRNAs were also 21 nucleotides long, with a 19 nucleotide double strand
region. The sense and antisense strands were each phosphorylated at their 5' ends. The three
3' terminal nucleotides of the sense and antisense strands were phosphorothioate
deoxyribonucleotides, and the two terminal phosphorothioate thymidines were unpaired,
creating a 3' overhang region at each end of the iRNA molecule. See constructs 11, 13, 15,
and 17 in Table 4.

Class III dsRNAs included a 23 ribonucleotide antisense strand and a
21 ribonucleotide sense strand, to form a construct having a blunt 5' end and a 3' overhang region.
See constructs 19, 21, 23, and 25 in Table 4.
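
The Class III geometry can be checked with a short Python sketch. It assumes that the 5' end of the antisense strand pairs with the 3' end of the sense strand (the blunt end) and that any remaining antisense nucleotides form the 3' overhang. The function name is hypothetical.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def describe_duplex(sense, antisense):
    """Report the paired length and the antisense 3' overhang, assuming the
    5' end of the antisense strand pairs with the 3' end of the sense strand."""
    expected = sense.translate(RNA_COMPLEMENT)[::-1]  # reverse complement of sense
    if not antisense.startswith(expected):
        raise ValueError("antisense 5' region is not complementary to the sense strand")
    overhang = antisense[len(expected):]
    return {"paired_nt": len(expected), "antisense_3p_overhang": overhang}


# Construct 19 from Table 4.
info = describe_duplex("GUUUGGGCAGCUCUGCUCUCU", "AGAGAGCAGAGCUGCCCAAACUU")
print(info)
```

For construct 19 this reports a 21-nt paired region with a two-nucleotide overhang on the antisense strand.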

Within each of the three classes of iRNAs, the four dsRNA molecules were designed
to target four different regions of the ApoM transcript. dsRNAs 1, 11, and 19 targeted the 5'
end of the open reading frame (ORF). dsRNAs 2, 13, and 21, and 3, 15, and 23, targeted two
internal regions (one 5' proximal and one 3' proximal) of the ORF, and the 4, 17, and 25
iRNA constructs targeted a region of the 3' untranslated sequence (3' UTS) of the ApoM
mRNA. This is summarized in Table 5.

Table 5. iRNA molecules targeted to mouse ApoM

            iRNA targeted    iRNA targeted    iRNA targeted    iRNA targeted
            to 5' end of     to middle ORF    to middle ORF    to 3' UTS
            ORF              (5' proximal)    (3' proximal)

Class I     1                2                3                4
Class II    11               13               15               17
Class III   19               21               23               25



CD1 mice (6-8 weeks old, ~35 g) were administered one of the test iRNAs in PBS
solution. Two hundred micrograms of iRNA in a volume of solution equal to 10% body
weight (~5.7 mg iRNA/kg mouse) was administered by the method of high pressure tail vein
injection, over a 10-20 sec. time interval. After a 24 h recovery period, a second injection
was performed using the same dose and mode of administration as the first injection, and
following another 24 h, a third and final injection was administered, also using the same dose
and mode of administration. After a final 24 h recovery, the mouse was sacrificed, serum was
collected and the liver and kidney harvested to assay for an effect on ApoM gene expression.
Expression was monitored by quantitative RT-PCR and Western blot analyses. This
experiment was repeated for each of the iRNAs listed in Table 4.
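
The per-kilogram doses quoted in this example follow from simple arithmetic, sketched below in Python. The function names are hypothetical, and the solution density of 1.0 g/mL is an assumption for a dilute PBS solution, not a value given in the text.

```python
def dose_mg_per_kg(dose_ug, body_weight_g):
    """Convert an absolute dose in micrograms to mg per kg of body weight."""
    return (dose_ug / 1000.0) / (body_weight_g / 1000.0)


def injection_volume_ml(body_weight_g, fraction_of_body_weight=0.10,
                        solution_density_g_per_ml=1.0):
    """Volume of solution equal to a fraction of body weight.

    The density of 1.0 g/mL is an assumption for a dilute PBS solution.
    """
    return body_weight_g * fraction_of_body_weight / solution_density_g_per_ml


weight_g = 35.0  # approximate mouse weight from the text
print(round(dose_mg_per_kg(200, weight_g), 1))    # 200 microgram injections
print(round(dose_mg_per_kg(50, weight_g), 1))     # 50 microgram injections
print(round(injection_volume_ml(weight_g), 1))    # injected volume in mL
```

For a 35 g mouse, the 200 microgram and 50 microgram injections work out to roughly 5.7 and 1.4 mg iRNA/kg, consistent with the figures given in this example.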

Class I iRNAs did not alter ApoM RNA levels in mice, as indicated by quantitative 
RT-PCR. This is in contrast to the effect of these iRNAs in cultured HepG2 cells. Cells 
cotransfected with a plasmid expressing exogenous ApoM RNA under a CMV promoter and 
a class I iRNA demonstrated a 25% or greater reduction in ApoM RNA concentrations as 
compared to control transfections. The iRNA molecules 1, 2 and 3 each caused a 75% 
decrease in exogenous ApoM mRNA levels. 

Class II iRNAs reduced liver and kidney ApoM mRNA levels by ~30-85%. The iRNA
molecule "13" elicited the most dramatic reduction in mRNA levels; quantitative RT-PCR
indicated a decrease of about 85% in liver tissue. Serum ApoM protein levels were also
reduced as was evidenced by Western blot analysis. The iRNAs 11, 13 and 15 reduced
protein levels by about 50%, while iRNA 17 had the mildest effect, reducing levels only by
~15-20%.

Class III iRNAs (constructs 19, 21, and 23) reduced serum ApoM levels by ~40-50%.

To determine the effect of dosage on iRNA mediated ApoM inhibition, the
experiment described above was repeated with three injections of 50 µg iRNA "11"
(~1.4 mg iRNA/kg mouse). This lower dosage of iRNA resulted in a reduction of serum
ApoM levels of about 50%. This is compared with the reduction seen with the 200 µg
injections, which reduced serum levels by 25-45%. These results indicated the lower
dosage amounts of iRNAs were effective.

In an effort to increase iRNA uptake by cells, iRNAs were precomplexed with
Lipofectamine prior to tail vein injections. ApoM protein levels were about 50% of wildtype
levels in mice injected with iRNA "11" when the molecules were preincubated with
Lipofectamine; ApoM levels were also about 50% of wildtype when mice were injected with
iRNA "11" that was not precomplexed with Lipofectamine.

These experiments revealed that modified iRNAs can greatly influence RNAi-
mediated gene silencing. As demonstrated herein, modifications including phosphorothioate
nucleotides are particularly effective at decreasing target protein levels. 

Example 2: apoB protein as a therapeutic target for lipid-based diseases

Apolipoprotein B (apoB) is a candidate target gene for the development of novel
therapies for lipid-based diseases.

Methods described herein can be used to evaluate the efficacy of a particular siRNA
as a therapeutic tool for treating lipid metabolism disorders resulting from elevated apoB levels.
Use of siRNA duplexes to selectively bind and inactivate the target apoB mRNA is an
approach to treat these disorders.

Two approaches:

i) Inhibition of apoB in ex-vivo models by transfecting siRNA duplexes homologous
to human apoB mRNA in a human hepatoma cell line (Hep G2) and monitoring the level of the
protein and the RNA using the Western blotting and RT-PCR methods, respectively. siRNA
molecules that efficiently inhibit apoB expression will be tested for similar effects in vivo.

ii) In vivo trials using an apoB transgenic mouse model (apoB100 Transgenic Mice,
C57BL/6NTac-TgN (APOB100), Order Model #'s: 1004-T (hemizygotes), B6 (control)).
siRNA duplexes are designed to target apoB-100 or CETP/apoB double transgenic mice
which express both cholesteryl ester transfer protein (CETP) and apoB. The effect of the
siRNA on gene expression in vivo can be measured by monitoring the HDL/LDL cholesterol
level in serum. The results of these experiments would indicate the therapeutic potential of
siRNAs to treat lipid-based diseases, including hypercholesterolemia, HDL/LDL cholesterol 
imbalance, familial combined hyperlipidemia, and acquired hyperlipidemia. 

Background. Fats, in the form of triglycerides, are ideal for energy storage because they are
highly reduced and anhydrous. An adipocyte (or fat cell) consists of a nucleus, a cell 
membrane, and triglycerides, and its function is to store triglycerides. 

The lipid portion of the human diet consists largely of triglycerides and cholesterol 
(and its esters). These must be emulsified and digested to be absorbed. Specifically, fats
(triacylglycerols) are ingested. Bile (bile acids, salts, and cholesterol), which is made in the
liver, is secreted by the gall bladder. Pancreatic lipase digests the triglycerides to fatty acids,
and also digests di-, and mono-acylglycerols, which are absorbed by intestinal epithelial cells
and then are resynthesized into triacylglycerols once inside the cells. These triglycerides and
some cholesterols are combined with apolipoproteins to produce chylomicrons.
Chylomicrons consist of approximately 95% triglycerides. The chylomicrons transport fatty
acids to peripheral tissues. Any excess fat is stored in adipose tissue. 

Lipid transport and clearance from the blood into cells, and from the cells into the
blood and the liver, is mediated by the lipoprotein transport proteins. This class of
approximately 17 proteins can be divided into three groups: apolipoproteins, lipoprotein
processing proteins, and lipoprotein receptors.

Apolipoproteins coat lipoprotein particles, and include the A-I, A-II, A-IV, B, CI,
CII, CIII, D, E, and Apo(a) proteins. Lipoprotein processing proteins include lipoprotein lipase,
hepatic lipase, lecithin cholesterol acyltransferase and cholesterol ester transfer protein.
Lipoprotein receptors include the low density lipoprotein (LDL) receptor, chylomicron-
remnant receptor (the LDL receptor like protein or LDL receptor related protein - LRP) and
the scavenger receptor.

Lipoprotein Metabolism. Since the triglycerides, cholesterol esters, and cholesterol absorbed
into the small intestine are not soluble in aqueous medium, they must be combined with
suitable proteins (apolipoproteins) in order to prevent them from forming large oil droplets.
The resulting lipoproteins undergo a type of metabolism as they pass through the
bloodstream and certain organs (notably the liver).

Also synthesized in the liver is high density lipoprotein (HDL), which contains the
apoproteins A-1, C-1, and D; HDL collects cholesterol from peripheral tissues and
blood vessels and returns it to the liver. LDL is taken up by specific cell surface receptors
into an endosome, which fuses with a lysosome where cholesterol ester is converted to free
cholesterol. The apoproteins (including apo B-100) are digested to amino acids. The
receptor protein is recycled to the cell membrane.

The free cholesterol formed by this process has two fates. First, it can move to the
endoplasmic reticulum (ER), where it can inhibit HMG-CoA reductase, the synthesis of
HMG-CoA reductase, and the synthesis of cell surface receptors for LDL. Also in the ER,
cholesterol can speed up the degradation of HMG-CoA reductase. The free cholesterol can
also be converted by acyl-CoA:cholesterol acyltransferase (ACAT) to cholesterol esters, which
form oil droplets.

ApoB is the major apolipoprotein of chylomicrons, of very low density lipoproteins
(VLDL, which carry most of the plasma triglyceride) and of low density lipoprotein (LDL,
which carry most of the plasma cholesterol). ApoB exists in human plasma in two isoforms,
apoB-48 and apoB-100.

ApoB-100 is the major physiological ligand for the LDL receptor. The ApoB
precursor has 4563 amino acids, and the mature apoB-100 has 4536 amino acid residues. The
LDL-binding domain of ApoB-100 is proposed to be located between residues 3129 and
3532. ApoB-100 is synthesized in the liver and is required for the assembly of very low
density lipoproteins (VLDL) and for the preparation of apoB-100 to transport triglycerides
(TG) and cholesterol from the liver to other tissues. ApoB-100 does not interchange between
lipoprotein particles, as do the other lipoproteins, and it is found in IDL and LDL particles.
After the removal of apolipoproteins A, E and C, apoB is incorporated into VLDL by
hepatocytes. ApoB-48 is present in chylomicrons and plays an essential role in the intestinal
absorption of dietary fats. ApoB-48 is synthesized in the small intestine. It comprises the N- 
terminal 48% of apoB-100 and is produced by a posttranscriptional apoB-100 mRNA editing 
event at codon 2153 (C to U). This editing event is a product of the apoBEC-lb enzyme, 
which is expressed in the intestine. This editing event creates a stop codon instead of a 
glutamine codon, and therefore apoB-48, instead of apoB-100, is expressed in the intestine
(apoB-100 is expressed in the liver). 
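
The residue counts above can be related to the "48%" in the name apoB-48 with a short Python calculation. The sketch assumes that codon 2153 is numbered on the mature protein, so that translation of the edited mRNA ends after residue 2152; that numbering convention is an assumption and is not stated in the text.

```python
PRECURSOR_LENGTH = 4563   # residues in the apoB precursor (from the text)
MATURE_LENGTH = 4536      # residues in mature apoB-100 (from the text)
EDITED_CODON = 2153       # codon converted to a stop by C-to-U editing

# Residues removed from the precursor to give the mature protein.
signal_peptide_length = PRECURSOR_LENGTH - MATURE_LENGTH

# Assumption: codon 2153 is numbered on the mature protein, so translation
# of the edited mRNA ends after residue 2152.
apob48_length = EDITED_CODON - 1
fraction = apob48_length / MATURE_LENGTH

print(signal_peptide_length)
print(round(100 * fraction, 1))
```

Under that assumption the truncated product covers a little under half of mature apoB-100, in line with the description of apoB-48 as the N-terminal 48%.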

There is also strong evidence that plasma apoB levels may be a better index of the 
risk of coronary artery disease (CAD) than total or LDL cholesterol levels. Clinical studies
have demonstrated the value of measuring apoB in hypertriglyceridemic,
hypercholesterolemic and normolipidemic subjects.

Table 6. Reference Range Lipid level in the Blood

Lipid                                     Range (mmol/L)

Plasma Cholesterol                        3.5-6.5
Low density lipoprotein                   1.55-4.4
Very low density lipoprotein              0.128-0.645
High density lipoprotein/triglycerides    0.5-2.1
Total lipid                               4.0-10 g/L
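
As an illustration of how the plasma cholesterol range in Table 6 might be used, the Python sketch below converts the limits from mmol/L to mg/dL and checks a sample value against the range. The molar mass of cholesterol (about 386.65 g/mol) is supplied here as an outside assumption, the function names are hypothetical, and the sample value is invented for the example.

```python
# Reference range for plasma cholesterol from Table 6, in mmol/L.
CHOLESTEROL_RANGE_MMOL_L = (3.5, 6.5)

# Approximate molar mass of cholesterol, used to convert mmol/L to mg/dL.
CHOLESTEROL_MOLAR_MASS_G_PER_MOL = 386.65


def cholesterol_mmol_to_mg_dl(value_mmol_l):
    """mmol/L -> mg/dL: multiply by molar mass (mg/mmol), divide by 10 dL/L."""
    return value_mmol_l * CHOLESTEROL_MOLAR_MASS_G_PER_MOL / 10.0


def within_range(value, low_high):
    """Return True if value lies inside the inclusive (low, high) range."""
    low, high = low_high
    return low <= value <= high


for limit in CHOLESTEROL_RANGE_MMOL_L:
    print(limit, round(cholesterol_mmol_to_mg_dl(limit)))

# Hypothetical measurement of 7.2 mmol/L.
print(within_range(7.2, CHOLESTEROL_RANGE_MMOL_L))
```

The conversion factor applies only to cholesterol measurements; triglyceride and total lipid values would need their own factors.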



Molecular genetics of lipid metabolism in both humans and induced mutant mouse models.
Elevated plasma levels of LDL and apoB are associated with a higher risk for atherosclerosis
and coronary heart disease, a leading cause of mortality. ApoB is the mandatory constituent
of LDL particles. In addition to its role in lipoprotein metabolism, apoB has also been
implicated as a factor in male infertility and fetal development. Furthermore, two
quantitative trait loci regulating plasma apoB levels have been discovered, through the use of
transgenic mouse models. Future experiments will facilitate the identification of human
orthologous genes encoding regulators of plasma apoB levels. These loci are candidate
therapeutic targets for human disorders characterized by altered plasma apoB levels. Such
disorders include non-apoB linked hypobetalipoproteinemia and familial combined
hyperlipidemia. The identification of these genetic loci would also reveal possible new
pathways involved in the regulation of apoB secretion, potentially providing novel sites for
pharmacological therapy.

Diseases and Clinical Pharmacology. Familial combined hyperlipidemia (FCHL) affects an
estimated one in 10 Americans. FCHL can cause premature heart disease.

Familial Hypercholesterolemia (high level of apo B). A common genetic disorder of lipid
metabolism. Familial hypercholesterolemia is characterized by elevated serum TC in
association with xanthelasma, tendon and tuberous xanthomas, accelerated atherosclerosis,
and early death from myocardial infarction (MI). It is caused by absent or defective LDL
cell receptors, resulting in delayed LDL clearance, an increase in plasma LDL levels, and an
accumulation of LDL cholesterol in macrophages over joints and pressure points, and in
blood vessels.

Atherosclerosis (high level of apo B). Atherosclerosis develops as a deposition of cholesterol
and fat in the arterial wall due to disturbances in lipid transport and clearance from the blood 
into cells and from the cells to blood and the liver. 

Clinical studies have demonstrated that elevations of total cholesterol (TC), low-density
lipoprotein cholesterol (LDL-C) and apoB-100 promote human atherosclerosis.
Similarly, decreased levels of high-density lipoprotein cholesterol (HDL-C) are associated
with the development of atherosclerosis.

ApoB may be a factor in the genetic cause of high cholesterol.

The risk of coronary artery disease (CAD) (high level of apo B) Cardiovascular disease,
including coronary heart disease and stroke, is a leading cause of death and disability. The
major risk factors include age, gender, elevated low-density lipoprotein cholesterol blood
levels, decreased high-density lipoprotein cholesterol levels, cigarette smoking, hypertension,
and diabetes. Emerging risk factors include elevated lipoprotein (a), remnant lipoproteins,
and C reactive protein. Dietary intake, physical activity and genetics also impact
cardiovascular risk. Hypertension and age are the major risk factors for stroke.

Abetalipoproteinemia, an inherited human disease characterized by a near-complete 
absence of apoB-containing lipoproteins in the plasma, is caused by mutations in the gene for 
microsomal triglyceride transfer protein (MTP). 

Model for human atherosclerosis (Lipoprotein A transgenic mouse) Numerous studies have
demonstrated that an elevated plasma level of lipoprotein(a) (Lp(a)) is a major independent
risk factor for coronary heart disease (CHD). Current therapies, however, have little or no
effect on apo(a) levels, and the homology between apo(a) and plasminogen presents barriers
to drug development. Lp(a) particles consist of apo(a) and apoB-100 proteins, and they are
found only in primates and the hedgehog. The development of an LPA transgenic mouse
requires the creation of animals that express both human apoB and apo(a) transgenes to
achieve assembly of Lp(a). An atherosclerosis mouse model would facilitate the study of
the disease process and factors influencing it, and further would facilitate the development of
therapeutic or preventive agents. There are several strategies for gene-oriented therapy. For
example, the missing or non-functional gene can be replaced, or unwanted gene activity can
be inhibited.

Model for lipid Metabolism and Atherosclerosis DNX Transgenic Sciences has 
demonstrated that both CETP/ApoB and ApoB transgenic mice develop atherosclerotic 
plaques. 

Model for apoB-100 overexpression The apoB-100 transgenic mice express high levels of
human apoB-100. They consequently demonstrate elevated serum levels of LDL cholesterol. 
After 6 months on a high-fat diet, the mice develop significant foam cell accumulation under 
the endothelium and within the media, as well as cholesterol crystals and fibrotic lesions. 

Model for cholesteryl ester transfer protein overexpression The apoB-100 transgenic mice
express the human enzyme, CETP, and consequently demonstrate a dramatically reduced
level of serum HDL cholesterol.

Model for apoB-100 and CETP overexpression The apoB-100 transgenic mice express both
CETP and apoB-100, resulting in mice with a human-like serum HDL/LDL distribution.
Following 6 months on a high-fat diet these mice develop significant foam cell accumulation
underlying the endothelium and within the media, as well as cholesterol crystals and fibrotic
lesions.

ApoB100 Transgenic Mice (Order Model #s: 1004-T (hemizygotes), B6 (control))
These mice express high levels of human apoB-100, resulting in mice with elevated serum
levels of LDL cholesterol. These mice are useful in identifying and evaluating compounds to
reduce elevated levels of LDL cholesterol and the risk of atherosclerosis. When fed a high
fat, high cholesterol diet, these mice develop significant foam cell accumulation underlying the
endothelium and within the media, and have significantly more complex atherosclerotic
lesions than control animals.

Double Transgenic Mice, CETP/ApoB100 (Order Model #: 1007-TT) These mice express
both CETP and apoB-100, resulting in a human-like serum HDL/LDL distribution. These
mice are useful for evaluating compounds to treat hypercholesterolemia or HDL/LDL
cholesterol imbalance to reduce the risk of developing atherosclerosis. When fed a high fat,
high cholesterol diet, these mice develop significant foam cell accumulation underlying the
endothelium and within the media, and have significantly more complex atherosclerotic
lesions than control animals.

ApoE gene knockout mouse Homozygous apoE knockout mice exhibit strong
hypercholesterolemia, primarily due to elevated levels of VLDL and IDL caused by a defect
in lipoprotein clearance from plasma. These mice develop atherosclerotic lesions which
progress with age and resemble human lesions (Zhang et al., Science 258:46-71, 1992;
Plump et al., Cell 71:343-353, 1992; Nakashima et al., Arterioscler Thromb. 14:133-140,
1994; Reddick et al., Arterioscler Thromb. 14:141-147, 1994). These mice are a promising
model for studying the effect of diet and drugs on atherosclerosis.

Low density lipoprotein receptor (LDLR) mediates lipoprotein clearance from plasma
through the recognition of apoB and apoE on the surface of lipoprotein particles. Humans
who lack LDL receptors, or have a decreased number of them, have familial
hypercholesterolemia and develop CHD at an early age.

ApoE Knockout Mice (Order Model #: APOE-M) The apoE knockout mouse was created by
gene targeting in embryonic stem cells to disrupt the apoE gene. ApoE, a glycoprotein, is a
structural component of very low density lipoprotein (VLDL) synthesized by the liver and
intestinally synthesized chylomicrons. It is also a constituent of a subclass of high density
lipoproteins (HDLs) involved in cholesterol transport activity among cells. One of the most
important roles of apoE is to mediate high affinity binding of chylomicrons and VLDL
particles that contain apoE to the low density lipoprotein (LDL) receptor. This allows for the
specific uptake of these particles by the liver, which is necessary for transport and prevents the
accumulation in plasma of cholesterol-rich remnants. The homozygous inactivation of the
apoE gene results in animals that are devoid of apoE in their sera. The mice appear to
develop normally, but they exhibit five times the normal serum plasma cholesterol and
spontaneous atherosclerotic lesions. This is similar to a disease in people who have a variant
form of the apoE gene that is defective in binding to the LDL receptor and are at risk for
early development of atherosclerosis and increased plasma triglyceride and cholesterol
levels. There are indications that apoE is also involved in immune system regulation, nerve
regeneration and muscle differentiation. The apoE knockout mice can be used to study the
role of apoE in lipid metabolism, atherogenesis, and nerve injury, and to investigate
intervention therapies that modify the atherogenic process.

ApoE4 Targeted Replacement Mouse (Order Model #: 001549-M) ApoE is a plasma protein
involved in cholesterol transport, and the three human isoforms (E2, E3, and E4) have been
associated with atherosclerosis and Alzheimer's disease. Gene targeting of 129 ES cells was
used to replace the coding sequence of mouse apoE with human APOE4 without disturbing
the murine regulatory sequences. The E4 isoform occurs in approximately 14% of the
human population and is associated with increased plasma cholesterol and a greater risk of
coronary artery disease. The Taconic apoE4 Targeted Replacement model has normal
plasma cholesterol and triglyceride levels, but altered quantities of different plasma
lipoprotein particles. This model also has delayed plasma clearance of cholesterol-rich
lipoprotein particles (VLDL), with only half the clearance rate seen in the apoE3 Targeted
Replacement model. Like the apoE3 model, the apoE4 mice develop altered plasma
lipoprotein values and atherosclerotic plaques on an atherogenic diet. However, the
atherosclerosis is more severe in the apoE4 model, with larger plaques and cholesterol, apoE,
and apoB-48 levels twice those seen in the apoE3 model. The Taconic apoE4 Targeted
Replacement model, along with the apoE2 and apoE3 Targeted Replacement Mice, provides
an excellent tool for in vivo study of the human apoE isoforms.

CETP Transgenic Mice (Order Model #: 1003-T) These animals express the human plasma
enzyme, CETP, resulting in mice with a dramatic reduction in serum HDL cholesterol. The
mice can be useful in identifying and evaluating compounds that increase the levels of HDL
cholesterol for reducing the risk of developing atherosclerosis.

Transgene/Promoter: human apolipoprotein A-I These mice produce mouse HDL
cholesterol particles that contain human apolipoprotein A-I. Transgenic expression is
life-long in both sexes (Biochemical Genetics and Metabolism Laboratory, Rockefeller
University, NY City).

A Mouse Model for Abetalipoproteinemia Abetalipoproteinemia, an inherited human disease
characterized by a near-complete absence of apoB-containing lipoproteins in the plasma, is
caused by mutations in the gene for microsomal triglyceride transfer protein (MTP). Gene
targeting was used to knock out the mouse MTP gene (Mttp). In heterozygous knockout
mice (Mttp+/-), the MTP mRNA, protein, and activity levels were reduced by 50% in both
liver and intestine. Recent studies with heterozygous MTP knockout mice have suggested
that half-normal levels of MTP in the liver reduce apoB secretion. The investigators hypothesized that
reduced apoB secretion in the setting of half-normal MTP levels might be caused by a
reduced MTP:apoB ratio in the endoplasmic reticulum, which would reduce the number of
apoB-MTP interactions. If this hypothesis were true, half-normal levels of MTP might have
little impact on lipoprotein secretion in the setting of half-normal levels of apoB synthesis
(since the ratio of MTP to apoB would not be abnormally low) and might cause an
exaggerated reduction in lipoprotein secretion in the setting of apoB overexpression (since
the ratio of MTP to apoB would be even lower). To test this hypothesis, they examined the
effects of heterozygous MTP deficiency on apoB metabolism in the setting of normal levels
of apoB synthesis, half-normal levels of apoB synthesis (heterozygous Apob deficiency), and
increased levels of apoB synthesis (transgenic overexpression of human apoB). Contrary to
their expectations, half-normal levels of MTP reduced plasma apoB-100 levels to the same
extent (~25-35%) at each level of apoB synthesis. In addition, apoB secretion from primary
hepatocytes was reduced to a comparable extent at each level of apoB synthesis. Thus, these
results indicate that the concentration of MTP within the endoplasmic reticulum, rather than
the MTP:apoB ratio, is the critical determinant of lipoprotein secretion. Finally,
heterozygosity for an apoB knockout mutation was found to lower plasma apoB-100 levels
more than heterozygosity for an MTP knockout allele. Consistent with that result, hepatic
triglyceride accumulation was greater in heterozygous apoB knockout mice than in
heterozygous MTP knockout mice. Cre/loxP tissue-specific recombination techniques were
also used to generate liver-specific Mttp knockout mice. Inactivation of the Mttp gene in the
liver caused a striking reduction in very low density lipoprotein (VLDL) triglycerides and
large reductions in both VLDL/low density lipoproteins (LDL) and high density lipoprotein
cholesterol levels. Histologic studies in liver-specific knockout mice revealed moderate
hepatic steatosis. Currently being tested is the hypothesis that accumulation of triglycerides
in the liver renders the liver more susceptible to injury by a second insult (e.g.,
lipopolysaccharide).
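The ratio hypothesis described above can be made concrete with a little arithmetic. The following is only an illustrative sketch: levels are expressed relative to wild type, the function and variable names are invented for this example, and the two-fold value used for apoB overexpression is an assumed placeholder, since the text gives no fold-change for the transgene.

```python
# Relative levels are expressed as fractions of wild type (wild type = 1.0).
MTP_HETEROZYGOUS = 0.5  # half-normal MTP, as in Mttp+/- mice

# apoB synthesis settings named in the text. The 2.0 for overexpression is a
# hypothetical placeholder; the text does not state a fold-change.
APOB_SETTINGS = {
    "half-normal apoB (Apob+/-)": 0.5,
    "normal apoB": 1.0,
    "apoB overexpression (assumed 2x)": 2.0,
}


def mtp_to_apob_ratio(mtp_level, apob_level):
    """Return the MTP:apoB ratio relative to wild type (wild type = 1.0)."""
    return mtp_level / apob_level


def ratio_hypothesis_table(mtp_level=MTP_HETEROZYGOUS):
    """Map each apoB setting to the MTP:apoB ratio the hypothesis relies on."""
    return {
        name: mtp_to_apob_ratio(mtp_level, apob)
        for name, apob in APOB_SETTINGS.items()
    }


if __name__ == "__main__":
    for name, ratio in ratio_hypothesis_table().items():
        print(f"{name}: MTP:apoB ratio = {ratio:.2f} of wild type")
```

Under these assumptions the ratio ranges from normal (half-normal apoB) to well below normal (overexpression), so the ratio hypothesis predicts very different effects across the three settings. The reported reduction of roughly 25-35% at every level of apoB synthesis is what argues for MTP concentration, rather than the ratio, as the determinant.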

Human apo B (apolipoprotein B) Transgene mice show apo B locus may have a causative
role in male infertility The fertility of apoB (apolipoprotein B) (+/-) mice was recorded during
the course of backcrossing (to C57BL/6J mice) and test mating. No apparent fertility
problem was observed in female apoB (+/-) and wild-type female mice, as was documented
by the presence of vaginal plugs in female mice. Although apoB (+/-) mice mated normally,
only 40% of the animals from the second backcross generation produced any offspring
within the 4-month test period. Of the animals that produced progeny, litters resulted from
< 50% of documented matings. In contrast, all wild-type mice (6/6, i.e., 100%) tested were
fertile. These data suggest genetic influence on the infertility phenotype, as a small number
of male heterozygotes were not sterile. Fertilization in vivo was dramatically impaired in
male apoB (+/-) mice. 74% of eggs examined were fertilized by the sperm from wild-type
mice, whereas only 3% of eggs examined were fertilized by the sperm from apoB (+/-) mice.
The sperm counts of apoB (+/-) mice were mildly but significantly reduced compared with
controls. However, the percentage of motile sperm was markedly reduced in the apoB (+/-)
animals compared with that of the wild-type controls. Of the sperm from apoB (+/-) mice,
20% (i.e., 4.9% of the initial 20% motile sperm) remained motile after 6 hr of incubation,
whereas 45% (i.e., 33.6% of the initial 69.5%) of the motile sperm retained motility in
controls after this time. In vitro fertilization yielded no fertilized eggs in three attempts with
apoB (+/-) mice, while wild-type controls showed a fertilization rate of 53%. However,
sperm from apoB (+/-) mice fertilized 84% of eggs once the zona pellucida had been
removed. Numerous sperm from apoB (+/-) mice were seen binding to zona-intact eggs.
However, these sperm lost their motility when observed 4-6 hours after binding, showing that
sperm from apoB (+/-) mice were unable to penetrate the zona pellucida but that the
interaction between sperm and egg was probably not direct. Sperm binding to zona-free
oocytes was abnormal. In the apoB (+/-) mice, sperm binding did not attenuate, even after
pronuclei had clearly formed, suggesting that apoB deficiency results in abnormal surface
interaction between the sperm and egg.

Knockout of the mouse apoB gene resulted in embryonic lethality in homozygotes,
protection against diet-induced hypercholesterolemia in heterozygotes, and developmental
abnormalities in mice.

Model of insulin resistance, dyslipidemia & overexpression of human apoB It was shown 
that the livers of apoB mice assemble and secrete increased numbers of VLDL particles. 

Example 3. Treatment of Diabetes Type-2 with iRNA

Introduction The regulation of hepatic gluconeogenesis is an important process in the
adjustment of the blood glucose level. Pathological changes in the glucose production of the
liver are a central characteristic in type-2-diabetes. For example, the fasting hyperglycemia
observed in patients with type-2-diabetes reflects the lack of inhibition of hepatic
gluconeogenesis and glycogenolysis due to the underlying insulin resistance in this disease.
Extreme conditions of insulin resistance can be observed for example in mice with a
liver-specific insulin receptor knockout ('LIRKO'). These mice have an increased expression of
the two rate-limiting gluconeogenic enzymes, phosphoenolpyruvate carboxykinase (PEPCK)
and the glucose-6-phosphatase catalytic subunit (G6Pase). Insulin is known to repress both
PEPCK and G6Pase gene expression at the transcriptional level, and the signal transduction
involved in the regulation of G6Pase and PEPCK gene expression by insulin is only partly
understood. While PEPCK is involved in a very early step of hepatic gluconeogenesis
(synthesis of phosphoenolpyruvate from oxaloacetate), G6Pase catalyzes the terminal step of
both gluconeogenesis and glycogenolysis, the cleavage of glucose-6-phosphate into
phosphate and free glucose, which is then delivered into the blood stream.

The pharmacological intervention in the regulation of expression of PEPCK and
G6Pase can be used for the treatment of the metabolic aberrations associated with diabetes.
Hepatic glucose production can be reduced by an iRNA-based reduction of PEPCK and
G6Pase enzymatic activity in subjects with type-2-diabetes.

Targets for iRNA

Glucose-6-phosphatase (G6Pase)

G6Pase mRNA is expressed principally in liver and kidney, and in lower amounts in
the small intestine. Membrane-bound G6Pase is associated with the endoplasmic reticulum.
Low activities have been detected in skeletal muscle and in astrocytes as well.

G6Pase catalyzes the terminal step in gluconeogenesis and glycogenolysis. The 
activity of the enzyme is several fold higher in diabetic animals and probably in diabetic 
humans. Starvation and diabetes cause a 2-3-fold increase in G6Pase activity in the liver and 
a 2-4-fold increase in G6Pase mRNA. 



Phosphoenolpyruvate carboxykinase (PEPCK)

Overexpression of PEPCK in mice results in symptoms of type-2-diabetes mellitus.
PEPCK overexpression results in a metabolic pattern that increases G6Pase mRNA and
results in a selective decrease in insulin receptor substrate (IRS)-2 protein, decreased
phosphatidylinositol 3-kinase activity, and reduced ability of insulin to suppress
gluconeogenic gene expression.



Table 7. Other targets to inhibit hepatic glucose production

Target                                      Comment
FKHR                                        good evidence for antidiabetic phenotype
                                            (Nakae et al., Nat Genetics 32:245 (2002))
Glucagon
Glucagon receptor
Glycogen phosphorylase
PGC-1 (PPAR-Gamma Coactivator)              regulates the cAMP response (and probably the
                                            PKB/FKHR-regulation) on PEPCK/G6Pase
Fructose-1,6-bisphosphatase
Glucose-6-phosphate translocator
Glucokinase inhibitory regulatory protein


Materials and Methods 

Animals: BKS.Cg-m +/+ Lepr db mice, which contain a point mutation in the leptin receptor
gene, are used to examine the efficacy of iRNA for the targets listed above.
BKS.Cg-m +/+ Lepr db mice are available from the Jackson Laboratory (Stock Number
000642). These animals are obese at 3-4 weeks after birth, show elevation of plasma insulin
at 10 to 14 days, elevation of blood sugar at 4 to 8 weeks, and uncontrolled rise in blood
sugar. Exogenous insulin fails to control blood glucose levels and gluconeogenic activity
increases.

The following numbers of male animals (age >12 weeks) would ideally be tested with
the following iRNAs:

PEPCK, 2 sequences, 5 animals per sequence
G6Pase, 2 sequences, 5 animals per sequence
1 nonspecific sequence, 5 animals
1 control group (only injected, no siRNA), 5 animals
1 control group (not injected, no siRNA), 5 animals
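As a quick planning aid, the group sizes listed above can be tallied as follows. This is a minimal sketch whose group labels and function name are invented for illustration; it counts only the groups in this list and does not include any separately described positive-control group.

```python
ANIMALS_PER_GROUP = 5

# One entry per treatment group from the list above (labels are illustrative).
GROUPS = {
    "PEPCK sequence 1": ANIMALS_PER_GROUP,
    "PEPCK sequence 2": ANIMALS_PER_GROUP,
    "G6Pase sequence 1": ANIMALS_PER_GROUP,
    "G6Pase sequence 2": ANIMALS_PER_GROUP,
    "nonspecific sequence": ANIMALS_PER_GROUP,
    "control: injected, no siRNA": ANIMALS_PER_GROUP,
    "control: not injected, no siRNA": ANIMALS_PER_GROUP,
}


def total_animals(groups):
    """Return the total number of animals across all groups."""
    return sum(groups.values())


if __name__ == "__main__":
    print(f"{len(GROUPS)} groups, {total_animals(GROUPS)} animals in total")
```

With two sequences per target gene, the list amounts to seven groups of five animals each.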

Reagents: Necessary reagents would ideally include a Glucometer Elite XL (Bayer,
Pittsburgh, PA) for glucose quantification, and an Insulin Radioimmunoassay (RIA) kit
(Amersham, Piscataway, NJ) for insulin quantitation.

Assays:

G6P enzyme assays and PEPCK enzyme assays are used to measure the activity of the
enzymes. Northern blotting is used to detect levels of G6Pase and PEPCK mRNA.
Antibody-based techniques (e.g., immunoblotting, immunofluorescence) are used to detect
levels of G6Pase and PEPCK protein. Glycogen staining is used to detect levels of glycogen
in the liver. Histological analysis is performed to analyze tissues.

Gene information:

G6Pase
GenBank® No.: NM_008061, Mus musculus glucose-6-phosphatase, catalytic
(G6pc), mRNA 1..2259, ORF 83..1156;
GenBank® No: U00445, Mus musculus glucose-6-phosphatase mRNA, complete cds
1..2259, ORF 83..1156
GenBank® No: BC013448

PEPCK
GenBank® No: NM_011044, Mus musculus phosphoenolpyruvate carboxykinase 1,
cytosolic (Pck1), mRNA, 1..2618, ORF 141..2009
GenBank® No: AF009605.1
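The ORF coordinates given above are inclusive, 1-based positions on the mRNA, so a simple consistency check is that each ORF spans a whole number of codons. The sketch below assumes the G6pc ORF is 83..1156 and reads the Pck1 ORF as 141..2009; the helper names are invented for this example.

```python
def orf_length(start, end):
    """Length in nucleotides of an ORF given inclusive 1-based coordinates."""
    if end < start:
        raise ValueError("ORF end must not precede ORF start")
    return end - start + 1


def codon_count(start, end):
    """Number of codons (including the stop codon) spanned by the ORF."""
    length = orf_length(start, end)
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3


# ORF coordinates as read from the gene information above.
ORFS = {
    "G6pc (NM_008061)": (83, 1156),
    "Pck1 (NM_011044)": (141, 2009),
}

if __name__ == "__main__":
    for name, (start, end) in ORFS.items():
        print(f"{name}: {orf_length(start, end)} nt, "
              f"{codon_count(start, end)} codons including stop")
```

Both coordinate pairs give lengths divisible by three (1074 nt and 1869 nt), which is consistent with the ORF positions as read here.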

Administration of iRNA: 

iRNA corresponding to the genes described above would be administered to mice 
with hydrodynamic injection. One control group of animals would be treated with 
Metformin as a positive control for reduction in hepatic glucose levels. 

Experimental Protocol 

Mice would be housed in a facility in which there is light from 7:00 AM to 7:00 PM.
Mice would be fed ad libitum from 7:00 PM to 7:00 AM and fast from 7:00 AM to 7:00 PM.

Day 0: 7:00 PM: Approximately 100 µl blood would be drawn from the tail. Serum would
be isolated to measure glucose, insulin, HbA1c (EDTA-blood), glucagon, FFAs, lactate,
corticosterone, serum triglycerides.

Day 1-7: Blood glucose would be measured daily at 8:00 AM and 6:00 PM (approx. 3-5 µl;
measured with a Haemoglucometer).

Day 8: Blood glucose would be measured at 8:00 AM and 6:00 PM. iRNA would be
injected between 10:00 AM and 2:00 PM.

Day 9-20: Blood glucose would be measured daily at 8:00 AM and 6:00 PM.
Day 21: Mice would be sacrificed after 10 hours of fasting.

Blood would be isolated. Glucose, insulin, HbA1c (EDTA-blood), glucagon, FFAs, lactate,
corticosterone, serum triglycerides would be measured. Liver tissue would be isolated for
histology, protein assays, RNA assays, glycogen quantitation, and enzyme assays.

Example 4: Inhibition of Glucose-6-Phosphatase iRNA in vivo

iRNA targeted to the Glucose-6-Phosphatase (G6P) gene was used to examine the
effects of inhibition of G6P expression on glucose metabolism in vivo.

Female mice, 10 weeks of age, strain BKS.Cg-m +/+ Lepr db (The Jackson
Laboratory) were used for in vivo analysis of enzymes of the hepatic glucose production.
Mice were housed under conditions where it was light from 6:30 am to 6:30 pm. Mice were
fed (ad libitum) during the night period and fasted during the day period.

On day 1, approximately 100 µl of blood was collected from test animals by puncturing the
retroorbital plexus. On days 1-7, blood glucose was measured in blood obtained from tail
veins (approximately 3-5 µl) using a Glucometer (Elite XL, Bayer). Blood glucose was
sampled daily at 8 am and 6 pm.

On day 7 at approximately 2 pm, GL3 plasmid (10 µg) and siRNAs (100 µg G6Pase
specific, Renilla nonspecific or no siRNA control) were delivered to animals using
hydrodynamic coinjection.

On day 8, GL3 expression was analyzed by injection of luciferin (3 mg) after
anaesthesia with avertin and imaging. This was done to control for successful hydrodynamic
delivery.

On days 8-10, blood glucose was measured in blood obtained from tail veins
(approximately 3-5 µl) using a Glucometer (Elite XL, Bayer).

On day 10, mice were sacrificed after 10 hours of fasting. Blood and liver were 
isolated from sacrificed animals. 

Results: Coinjection of GL3 plasmid and G6Pase iRNA (G6P4) reduced blood
glucose levels for the short term. Coinjection of GL3 plasmid and Renilla nonspecific iRNA
had no effect on blood glucose levels.

Example 5: Selected Palindromic Sequences

Tables 8-13 below provide selected palindromic sequences from the following genes: human
ApoB, human glucose-6-phosphatase, rat glucose-6-phosphatase, β-catenin, and hepatitis C
virus (HCV).
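One way to inspect a source/match pair from these tables is to compare the match sequence with the reverse complement of the source sequence. The sketch below is an assumption about how such pairs can be examined, not a description of how the tables were generated; the function names are invented, and the example pair is SEQ ID NO: 2 and SEQ ID NO: 1005 as they appear in Table 8.

```python
COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


def count_complementary_positions(source, match):
    """Count positions at which `match` equals the reverse complement of `source`.

    Each such position is one at which the two sequences could form a
    Watson-Crick pair when aligned antiparallel.
    """
    if len(source) != len(match):
        raise ValueError("sequences must have equal length")
    rc = reverse_complement(source)
    return sum(1 for a, b in zip(rc, match) if a == b)


if __name__ == "__main__":
    source = "tgccatctcgagagttcca"  # SEQ ID NO: 2 in Table 8
    match = "tggaactctctccatggca"   # SEQ ID NO: 1005 in Table 8
    n = count_complementary_positions(source, match)
    print(f"{n} of {len(source)} positions are complementary")
```

For this pair, 16 of the 19 positions are complementary, with the three mismatches clustered near the center of the duplex.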
Table 8. Selected palindromic sequences from human ApoB







               Source               Start  End                     Match                Start  End
                                    Index  Index                                        Index  Index  D  D
SEQ ID NO: 1   ggccattccagaagggaag  509    528    SEQ ID NO: 1004  cttccgttctgtaatggcc  5795   5814   1  9
SEQ ID NO: 2   tgccatctcgagagttcca  4099   4118   SEQ ID NO: 1005  tggaactctctccatggca  10876  10895  1  8
SEQ ID NO: 3   catgtcaaacactttgtta  7056   7075   SEQ ID NO: 1006  taacaaattccttgacatg  7358   7377   1  8
SEQ ID NO: 4   tttgttataaatcttattg  7068   7087   SEQ ID NO: 1007  caataagatcaatagcaaa  8990   9009   1  8
SEQ ID NO: 5   tctggaaaagggtcatgga  8880   8899   SEQ ID NO: 1008  tccatgtcccatttacaga  11356  11375  1  8
SEQ ID NO: 6   cagctcttgttcaggtcca  10900  10919  SEQ ID NO: 1009  tggacctgcaccaaagctg  13952  13971  1  8
SEQ ID NO: 7   ggaggttccccagctctgc  356    375    SEQ ID NO: 1010  gcagccctgggaaaactcc  6447   6466   1  7
SEQ ID NO: 8   ctgttttgaagactctcca  1081   1100   SEQ ID NO: 1011  tggagggtagtcataacag  10327  10346  1  7
SEQ ID NO: 9   agtggctgaaacgtgtgca  1297   1316   SEQ ID NO: 1012  tgcagagctttctgccact  13508  13527  1  7
SEQ ID NO: 10  ccaaaatagaagggaatct  2068   2087   SEQ ID NO: 1013  agattcctttgccttttgg  4000   4019   1  7
SEQ ID NO: 11  tgaagagaagattgaattt  3620   3639   SEQ ID NO: 1014  aaattctcttttcttttca  9212   9231   1  7
SEQ ID NO: 12  agtggtggcaacaccagca  4230   4249   SEQ ID NO: 1015  tgctagtgaggccaacact  10649  10668  1  7
SEQ ID NO: 13  aaggctccacaagtcatca  5950   5969   SEQ ID NO: 1016  tgatgatatctggaacctt  10724  10743  1  7
SEQ ID NO: 14  gtcagccaggtttatagca  7725   7744   SEQ ID NO: 1017  tgctaagaaccttactgac  7781   7800   1  7
SEQ ID NO: 15  tgatatctggaaccttgaa  10727  10746  SEQ ID NO: 1018  ttcactgttcctgaaatca  7863   7882   1  7
SEQ ID NO: 16  gtcaagttgagcaatttct  13423  13442  SEQ ID NO: 1019  agaaaaggcacaccttgac  11072  11091  1  7
SEQ ID NO: 17  atccagatggaaaagggaa  13480  13499  SEQ ID NO: 1020  ttccaatttccctgtggat  3680   3699   1  7
SEQ ID NO: 18  atttgtttgtcaaagaagt  4543   4562   SEQ ID NO: 1021  acttcagagaaatacaaat  11401  11420  4  6
SEQ ID NO: 19  ctggaaaatgtcagcctgg  204    223    SEQ ID NO: 1022  ccagacttccgtttaccag  8235   8254   2  6
SEQ ID NO: 20  accaggaggttcttcttca  1729   1748   SEQ ID NO: 1023  tgaagtgtagtctcctggt  5089   5108   2  6
SEQ ID NO: 21  aaagaagttctgaaagaat  1956   1975   SEQ ID NO: 1024  attccatcacaaatccttt  9661   9680   2  6
SEQ ID NO: 22  gctacagcttatggctcca  3570   3589   SEQ ID NO: 1025  tggatctaaatgcagtagc  11623  11642  2  6
SEQ ID NO: 23  atcaatattgatcaatttg  6414   6433   SEQ ID NO: 1026  caaagaagtcaagattgat  4553   4572   2  6
SEQ ID NO: 24  gaattatcttttaaaacat  7326   7345   SEQ ID NO: 1027  atgtgttaacaaaatattc  11494  11513  2  6
SEQ ID NO: 25  cgaggcccgcgctgctggc  130    149    SEQ ID NO: 1028  gccagaagtgagatcctcg  3507   3526   1  6
SEQ ID NO: 26  acaactatgaggctgagag  271    290    SEQ ID NO: 1029  ctctgagcaacaaatttgt  10309  10328  1  6
SEQ ID NO: 27  gctgagagttccagtggag  282    301    SEQ ID NO: 1030  ctccatggcaaatgtcagc  10885  10904  1  6
SEQ ID NO: 28  tgaagaaaaccaagaactc  448    467    SEQ ID NO: 1031  gagtcattgaggttcttca  4929   4948   1  6
SEQ ID NO: 29  cctacttacatcctgaaca  558    577    SEQ ID NO: 1032  tgttcataagggaggtagg  12766  12785  1  6
SEQ ID NO: 30  ctacttacatcctgaacat  559    578    SEQ ID NO: 1033  atgttcataagggaggtag  12765  12784  1  6
SEQ ID NO: 31  gagacagaagaagccaagc  615    634    SEQ ID NO: 1034  gcttggttttgccagtctc  2459   2478   1  6
SEQ ID NO: 32  cactcactttaccgtcaag  671    690    SEQ ID NO: 1035  cttgaacacaaagtcagtg  6000   6019   1  6
SEQ ID NO: 33  ctgatcagcagcagccagt  822    841    SEQ ID NO: 1036  actgggaagtgcttatcag  5237   5256   1  6
SEQ ID NO: 34  actggacgctaagaggaag  854    873    SEQ ID NO: 1037  cttccccaaagagaccagt  2890   2909   1  6
SEQ ID NO: 35  agaggaagcatgtggcaga  865    884    SEQ ID NO: 1038  tctggcatttactttctct  5921   5940   1  6
SEQ ID NO: 36  tgaagactctccaggaact  1087   1106   SEQ ID NO: 1039  agttgaaggagactattca  7216   7235   1  6
SEQ ID NO: 37  ctctgagcaaaatatccag  1121   1140   SEQ ID NO: 1040  ctggttactgagctgagag  1161   1180   1  6
SEQ ID NO: 38  atgaagcagtcacatctct  1189   1208   SEQ ID NO: 1041  agagctgccagtccttcat  10016  10035  1  6
SEQ ID NO: 39  ttgccacagctgattgagg  1209   1228   SEQ ID NO: 1042  cctcctacagtggtggcaa  4222   4241   1  6
SEQ ID NO: 40  agctgattgaggtgtccag  1216   1235   SEQ ID NO: 1043  ctggattccacatgcagct  11847  11866  1  6
SEQ ID NO: 41  tgctccactcacatcctcc  1278   1297   SEQ ID NO: 1044  ggaggctttaagttcagca  7601   7620   1  6
SEQ ID NO: 42  tgaaacgtgtgcatgccaa  1303   1322   SEQ ID NO: 1045  ttgggagagacaagtttca  6500   6519   1  6
SEQ ID NO: 43  gacattgctaattacctga  1503   1522   SEQ ID NO: 1046  tcagaagctaagcaatgtc  7232   7251   1  6
SEQ ID NO: 44  ttcttcttcagactttcct  1738   1757   SEQ ID NO: 1047  aggagagtccaaattagaa  8498   8517   1  6
SEQ ID NO: 45  ccaatatcttgaactcaga  1903   1922   SEQ ID NO: 1048  tctgaattcattcaattgg  6485   6504   1  6
SEQ ID NO: 46  aaagttagtgaaagaagtt  1946   1965   SEQ ID NO: 1049  aactaccctcactgccttt  2132   2151   1  6
SEQ ID NO: 47  aagttagtgaaagaagttc  1947   1966   SEQ ID NO: 1050  gaacctctggcatttactt  5916   5935   1  6
SEQ ID NO: 48  aaagaagttctgaaagaat  1956   1975   SEQ ID NO: 1051  attctctggtaactacttt  5482   5501   1  6
SEQ ID NO: 49  tttggctataccaaagatg  2322   2341   SEQ ID NO: 1052  catcttaggcactgacaaa  4997   5016   1  6
SEQ ID NO: 50  tgttgagaagctgattaaa  2381   2400   SEQ ID NO: 1053  tttagccatcggctcaaca  5700   5719   1  6
SEQ ID NO: 51  caggaagggctcaaagaat  2561   2580   SEQ ID NO: 1054  attcctttaacaattcctg  9492   9511   1  6
SEQ ID NO: 52  aggaagggctcaaagaatg  2562   2581   SEQ ID NO: 1055  cattcctttaacaattcct  9491   9510   1  6
SEQ ID NO: 


53 


gaagggctcaaagaatgac 


2564 


2583 


SEQ ID NO: 1056 


gtcagtcttcaggctcttc 


7914 


7933 


1 


6 


SEQ ID NO: 


54 


caaagaatg acttttttct 


2572 


2591 


SEQ ID NO: 1057 


agaaggatggcattttttg 


14000 


14019 1 


6 


SEQ ID NO: 


55 


catggagaatgcctttgaa 


2603 


2622 


SEQ ID NO: 1058 


ttcagagccaaagtccatg 


7119 


7138 


1 


6 


SEQ ID NO: 


56 


ggagccaaggctggagtaa 


2679 


2698 


SEQ ID NO: 1059 


ttactccaacgccagctcc 


3050 


3069 


1 


6 


SEQ ID NO: 


57 


tcattccttccccaaagag 


2884 


2903 


SEQ ID NO: 1060 


ctctctggggcatctatga 


5139 


5158 


1 


6 


SEQ ID NO: 


58 


acctatgagctccagagag 


3165 


3184 


SEQ ID NO: 1061 


ctctcaagaccacagag gt 


12976 


12995 


1 


6 


SEQ ID NO: 


59 


gggcaaaacgtcttacaga 


3365 


3384 


SEQ ID NO: 1062 


tctgaaagacaacgtgccc 


12317 


12336 


1 


6 


SEQ ID NO: 


60 


accctggacattcagaaca 


3387 


3406 


SEQ ID NO: 1063 


tgttgctaaggttcagggt 


5675 


5694 


1 


6 


SEQ ID NO: 


61 


atgggcgacctaagttgtg 


3429 


3448 


SEQ ID NO: 1064 


cacaaattagtttcaccat 


8941 


8960 


1 


6 


SEQ ID NO: 


62 


gatgaagagaagattgaat 


3618 


3637 


SEQ ID NO: 1065 


attccagcttccccacatc


8330 


8349 


1 


6 


SEQ ID NO: 


63 


caatgtagataccaaaaaa 


3656 


3675 


SEQ ID NO: 1066 


ttttttggaaatgccattg 


8643 


8662 


1 


6 


SEQ ID NO: 


64


gtagataccaaaaaaatga 


3660 


3679 


SEQ ID NO: 1067 


tcatgtgatgggtctctac 


4371 


4390 


1 


6 


SEQ ID NO: 


65 


gcttcagttcatttggact


4509 


4528 


SEQ ID NO: 1068 


agtcaagaaggacttaagc 


5304 


5323 


1 


6 


SEQ ID NO: 


66 


tttgtttgtcaaagaagtc 


4544 


4563 


SEQ ID NO: 1069 


gacttcagagaaatacaaa 


11400 


11419 


1 


6 


SEQ ID NO: 


67 


ttgtttgtcaaagaagtca 


4545 


4564 


SEQ ID NO: 1070 


tgacttcagagaaatacaa 


11399 


11418 


1 


6 


SEQ ID NO: 


68 


tggcaatgggaaactcgct 


5846 


5865 


SEQ ID NO: 1071 


agcgagaatcaccctgcca


8219 


8238 


1 


6 


SEQ ID NO: 


69 


aacctctggcatttacttt 


5917 


5936 


SEQ ID NO: 1072 


aaaggagatgtcaagggtt 


10599 


10618 


1 


5 


SEQ ID NO: 


70 


catttactttctctcatga


5926 


5945 


SEQ ID NO: 1073 


tcatttgaaagaataaatg


7026 


7045 


1 


6 


SEQ ID NO: 


71 


aaagtcagtgccctgctta 


6009


6028 


SEQ ID NO: 1074 


taagaaccttactgacttt 


7784 


7803 


1 


6 
SEQ ID NO:


72 


tcccattttttqa q a c ctt 


6322 


6341 


SEQ ID NO: 1075


aaonaRttcaaoaatririaa 


-12004 




1 


6 


SEQ ID NO: 


73 


catcaatattgatcaattt 


6413 


6432 


SEQ ID NO: 1076


aaattaaaaaatnt tn atn 



6732


6751


1 
1 


6 


SEQ ID NO: 


74 


ta aa g a ta g ttatg at tta 


6665 


6684 


SEQ ID NO: 1077


taaaccaaaanttaattta 


9019


9038 


1 
1 


6 



SEQ ID NO: 


75 


tattgatgaaatcattaaa


6713 


6732 



SEQ ID NO: 1078


L IwdddU do I LCtO ado Q LCl 


finny 


fin9fi 


1 


6 


SEQ ID NO: 


76 


atqatctacatttqtttat 


6790 


6809 


SEQ ID NO: 1079


at a a an a a att a fl tpsit 








6 


SEQ ID NO: 


77 


aaaaacacatacaaaatat 


6919 


6938


SEQ ID NO: 1080


atatattntnsntfiprtpt 






1 


6 


SEQ ID NO: 


78 


a acacatacaci aatataa a 


6922 


6941 


SEQ ID NO: 1081


tctsaattrpinttpttnfp 


\ 1 / 


1 1 ^^4^ 


1 
1 


R 


SEQ ID NO: 


79 


aacatcitcaaacactttat 


7054 


7073 


SEQ ID NO: 1082


afiflaantpantnpppiTipt 


finny 


BOPfi 


1 
1 




SEQ ID NO: 


80 


tttttaaaaaaaaccaaaa 


7515 


7534 


SEQ ID NO: 1083


ppHintoiapappaasiaa 




1 1 P4Q 

1 1 t.'-fw 


1 




SEQ ID NO: 81 


ttttaciaaoaaaccaao oc 


7516 


7535 


SEQ ID NO: 1084


Qcrsttiatdtananrtaaaa 



i 1 


1 1948 






SEQ ID NO: 


82 


Q Q aaa atao acttccta aa 


9307 


9326 


SEQ ID NO: 1085


ttnaaaaatactattttnn 




12843 


1 
1 


6 


SEQ ID NO: 


83 


cactq tttctoa q tcccaq 


9334 


9353 


SEQ ID NO: 1086 


ctaaaacctaccaao aal a 


12523 


12542 



1 


6 


SEQ ID NO: 84 


cacaaatcctttaactata 


9668 


9687 


SEQ ID NO: 1087


cscatttcaa a a aattota 


10063 



10082 



1 


6 


SEQ ID NO: 


85 


ttcctqqatacactattcc 


9853 


9872 


SEQ ID NO: 1088


□aaactottaactcaansa 



12569 


12588 




6 


SEQ ID NO: 86 


gaaatctcaagctttctct


10042 


10061 


SEQ ID NO: 1089 


aoaaccaaatcQaoctttc 


11044 


11063 


1 


6 


SEQ ID NO: 


87 


tttcttcatcttcatctgt 


10210 


10229 


SEQ ID NO: 1090 


acaactaaaaaaaataaaa 


13055 


13074 


1 


6 



SEQ ID NO: 


88


tctaccgctaaaggagcag 


10521 


10540 


SEQ ID NO: 1091


ctacacactttGaaataaa 


11761


11780 


1 


6 


SEQ ID NO: 


89 


ctaccgctaaaggagcagt


10522 


10541 


SEQ ID NO: 1092 


actacacQctttaaaataa 


11760 


11779 


1 


6 



SEQ ID NO: 


90 


agggcctctttttcaccaa


10831 


10850 



SEQ ID NO: 1093


ttaaccaaaaaataaccct 


10957 


10976 


1 


6 


SEQ ID NO: 


91 


ttctccatccctcitaaaaq 


11265 


11284 


SEQ ID NO: 1094 


ctttttcaccaa ca a a a a a 


10838 



10857 


1 


6 


SEQ ID NO: 


92 


qaaaaacaaaqcaqattat 


11816 


11835 



SEQ ID NO: 1095 


ataaactacaaaatttttc 


13600 


13619 


1 


6 


SEQ ID NO: 


93 


actcactcattgattttct


12682 


12701 


SEQ ID NO: 1096


aoaaaatcaaaatctaaat 


14027 


14046 


1 


6 


SEQ ID NO: 


94 


taaactaataaatataatc 


12890 


12909 


SEQ ID NO: 1097


aattaGcaccaacaattta 



13578 


13597 1 


6 


SEQ ID NO: 


95 


caaaacqaacttcaaaaaa 


13200 


13219 


SEQ ID NO: 1098


cttcataaaa aatatttta 


13260 



13279 


1 


6 


SEQ ID NO: 


96 


tggaataatgctcagtgtt


2366 


2385 


SEQ ID NO: 1099


aacacttacttgaattcca



10662 


10681 


3 


5 



SEQ ID NO: 


97 


aatttgaaatccaaagaag



2400 


2419 


SEQ ID NO: 1100


cttcaaaaaaatacaaatn 



11402



11421 


3 


5 


SEQ ID NO: 


98 


atttgaaatccaaagaagt



2401 


2420 


SEQ ID NO: 1101


acttcaaaaaaatacaaat 


11401


11420 


3 


5 


SEQ ID NO: 99 


atcaacaaccQcttcttta 


990 


1009 


SEQ ID NO: 1102


caaaaaaatcaaaattoat 


4553 


4572 


2 


5 


SEQ ID NO: 


100 


tgttttgaagactctccag


1082 


1101 


SEQ ID NO: 1103 


ctaaaaaattaaaacaaca 


6955 



6974 


2 


5 


SEQ ID NO: 


101 


cccttctqataa atataa t 


1324 


1343 


SEQ ID NO: 1104


acca aaa eta a caeca a CIO 


13961 


13980 


2 


5 


SEQ ID NO: 


102 


tgagcaagtgaagaacttt 


1868 


1887 


SEQ ID NO: 1105 


aaagccattcagtctctca


12963 



12982 2 


5 


SEQ ID NO: 


103 


atttgaaatccaaagaagt


2401 






acttttctaaacttgaaat




9074 


2 


O 


SEQ ID NO: 


104 


atccaaagaagtcccggaa 


2408 


2427 


SEQ ID NO: 1107 


ttccggggaaacctgggat 


12721 


12740 2 


5 


SEQ ID NO: 


105 


agagcctacctccgcatct 


2430 


2449 


SEQ ID NO: 1108 


agatggtacgttagcctct 


11921 


11940 2 


5 


SEQ ID NO: 


106 


aatgcctttgaactcccca 


2610 


2629 


SEQ ID NO: 1109 


tgggaactacaatttcatt


7012 


7031 


2 


5 


SEQ ID NO: 


107 


gaagtccaaattccggatt 


3297 


3316 


SEQ ID NO: 1110 


aatcttcaatttattcttc 


13815 


13834 2 


5 


SEQ ID NO: 


108 


tgcaagcagaagccagaag 3496 


3515 


SEQ ID NO: 1111 


cttcaggttccatcgtgca


11376 


11395 2 


5 


SEQ ID NO: 


109 


gaagagaagattgaatttg 


3621 


3640 


SEQ ID NO: 1112 


caaaacctactgtctcttc 


10459


10478 2 


5 
SEQ ID NO:


110 


atgctaaaggcacatatgg 


4597 


4616 


SEQ ID NO: 1113 


ccatatgaaagtcaagcat 


12656 


12675 2 


5 


SEQ ID NO: 


111 


tccctcacctccacctctg 


4737 


4756 


SEQ ID NO: 1114 


cagattctcagatgaggga 


8912 


8931 


2 


5 


SEQ ID NO: 


112 


atttacagctctgacaagt 


5427 


5446 


SEQ ID NO: 1115 


acttttctaaacttgaaat 


9055 


9074 


2 


5 


SEQ ID NO: 


113 


aggagcctaccaaaataat 


5594 


5613 


SEQ ID NO: 1116 


attatgttgaaacagtcct


11830 


11849 2 


5 


SEQ ID NO: 


114 


aaagctgaagcacatcaat 


6401 


6420 


SEQ ID NO: 1117 


attgttgctcatctccttt 


10194 


10213 2 


5 


SEQ ID NO: 


115 


ctgctggaaacaacgagaa 


9418 


9437 


SEQ ID NO: 1118 


ttctgattaccaccagcag 


13574 


13593 2 


5 


SEQ ID NO: 


116 


ttgaaggaattcttgaaaa 


9582 


9601 


SEQ ID NO: 1119 


ttttaaaagaaatcttcaa 


13805 


13824 2 


5 


SEQ ID NO: 


117


gaagtaaaagaaaattttg 


10743 


10762 


SEQ ID NO: 1120 


caaaacctactgtctcttc


10459


10478 2 


5 


SEQ ID NO: 


118 


tgaagaagatggcaaattt 


11984 


12003 


SEQ ID NO: 1121 


aaatgtcagctcttgttca 


10894 


10913 2 


5 


SEQ ID NO: 


119 


aggatctgagttattttgc 


14035 


14054 


SEQ ID NO: 1122 


gcaagtcagcccagttcct 


10920 


10939 2 


5 


SEQ ID NO: 


120 


gtgcccttctcggttgctg 


18 


37 


SEQ ID NO: 1123 


cagccattgacatgagcac


5740 


5759 


1 


5 


SEQ ID NO: 


121 


ggcgctgcctgcgctgctg 


146 


165 


SEQ ID NO: 1124 


cagctccacagactccgcc 


3062 


3081 


1 


5 


SEQ ID NO: 


122 


ctgcgctgctgctgctgct 


154 


173 


SEQ ID NO: 1125 


agcagaaggtgcgaagcag 


3224 


3243 


1 


5 


SEQ ID NO: 


123 


gctgctggcgggcgccagg 


170 


189 


SEQ ID NO: 1126 


cctggattccacatgcagc


11846 


11865 


1 


5 


SEQ ID NO: 


124 


aagaggaaatgctggaaaa 


193 


212 


SEQ ID NO: 1127 


tttttcttcactacatctt 


2584 


2603 


1 


5 


SEQ ID NO: 


125 


ctggaaaatgtcagcctgg 


204 


223 


SEQ ID NO: 1128 


ccagacttccacatcccag 


3915 


3934 


1 


5 


SEQ ID NO: 


126 


tggagtccctgggactgct 


296 


315 


SEQ ID NO: 1129 


agcatgcctagtttctcca 


9945 


9964 


1 


5 


SEQ ID NO: 


127 


ggagtccctgggactgctg 


297 


316 


SEQ ID NO: 1130 


cagcatgcctagtttctcc 


9944 


9963 


1 


5 


SEQ ID NO: 


128 


tgggactgctgattcaaga


305 


324 


SEQ ID NO: 1131 


tcttccatcacttgaccca


2042 


2061 


1 


5 


SEQ ID NO: 


129 


ctgctgattcaagaagtgc 


310 


329 


SEQ ID NO: 1132 


gcacaccttgacattgcag 


11079 


11098 


1 


5 


SEQ ID NO: 


130 


tgccaccaggatcaactgc 


326 


345 


SEQ ID NO: 1133 


gcaggctgaactggtggca 


2717 


2736 


1 


5 


SEQ ID NO: 


131 


gccaccaggatcaactgca 


327 


346 


SEQ ID NO: 1134 


tgcaggctgaactggtggc 


2716 


2735 


1 


5 


SEQ ID NO: 


132 


tgcaaggttgagctggagg 


342 


361 


SEQ ID NO: 1135 


cctccacctctgatctgca 


4744 


4763 


1 


5 


SEQ ID NO: 


133 


caaggttgagctggaggtt 


344 


363 


SEQ ID NO: 1136 


aacccctacatgaagcttg 


13755 


13774 


1 


5 


SEQ ID NO: 


134 


ctctgcagcttcatcctga 


369 


388 


SEQ ID NO: 1137 


tcaggaagcttctcaagag 


13211 


13230 


1 


5 


SEQ ID NO: 


135 


cagcttcatcctgaagacc 


374 


393 


SEQ ID NO: 1138 


ggtcttgagttaaatgctg 


4977 


4996 


1 


5 


SEQ ID NO: 136 


gcttcatcctgaagaccag 


376 


395 


SEQ ID NO: 1139 


ctggacgctaagaggaagc 


855 


874 


1 


5 


SEQ ID NO: 


137 


tcatcctgaagaccagcca 


379 


398 


SEQ ID NO: 1140 


tggcatggcattatgatga 


3604 


3623 


1 


5 


SEQ ID NO: 


138 


gaaaaccaagaactctgag 


452 


471 


SEQ ID NO: 1141 


ctcaaccttaatgattttc 


8286 


8305 


1 


5 


SEQ ID NO: 


139 


agaactctgaggagtttgc 


460 


479 


SEQ ID NO: 1142 


gcaagctatacagtattct 


8377 


8396 


1 


5 


SEQ ID NO: 


140 


tctgaggagtttgctgcag 


465 


484 


SEQ ID NO: 1143 


ctgcaggggatcccccaga 


2526 


2545 


1 


5 


SEQ ID NO: 


141 


tttgctgcagccatgtcca 


474 


493 


SEQ ID NO: 1144 


tggaagtgtcagtggcaaa 


10372 


10391 


1 


5 


SEQ ID NO: 


142 


caagaggggcatcatttct 


578 


597 


SEQ ID NO: 1145 


agaataaatgacgttcttg 


7035 


7054 


1 


5 


SEQ ID NO:


143 


tcactttaccgtcaagacg 


674 


693 


SEQ ID NO: 1146


cgtctacactatcatgtga 


4360 


4379 


1 


5 


SEQ ID NO: 


144 


tttaccgtcaagacgagga 


678 


697 


SEQ ID NO: 1147 


tccttgacatgttgataaa 


7366 


7385 


1 


5 


SEQ ID NO:


145 


cactggacgctaagaggaa 


853 


872 


SEQ ID NO: 1148 


ttccagaaagcagccagtg 


12498 


12517 


1 


5 


SEQ ID NO: 


146 


aggaagcatgtggcagaag 


867 


886 


SEQ ID NO: 1149 


cttcatacacattaatcct 


9988 


10007 


1 


5 


SEQ ID NO: 


147 


caaggagcaacacctcttc 


893 


912 


SEQ ID NO: 1150 


gaagtagtactgcatcttg 


6835 


6854 


1 


5 
SEQ ID NO


: 148


acagactttgaaacttgaa 


959 


978 


SEQ ID NO 


: 149 


tgatgaagcagtcacatct 


1187 


1206 


SEQ ID NO 


: 150


agcagtcacatctctcttg 


1193 


1212 


SEQ ID NO 


: 151


ccagccccatcactttaca 


1231 


1250 


SEQ ID NO 


: 152 


ctccactcacatcctccag 


1280 


1299 


SEQ ID NO 


: 153 


catgccaacccccttctga 


1314 


1333 


SEQ ID NO 


: 154 


gagagatcttcaacatggc 


1390 


1409 


SEQ ID NO 


: 155 


tcaacatggcgagggatca 


1399 


1418 


SEQ ID NO 


: 156 


ccaccttgtatgcgctgag 


1429 


1448 


SEQ ID NO 


: 157 


gtcaacaactatcataaga


1455 


1474 


SEQ ID NO 


: 158 


tggacattgctaattacct


1501 


1520 


SEQ ID NO 


: 159 


ggacattgctaattacctg 


1502 


1521 


SEQ ID NO 


: 160 


ttctgcgggtcattggaaa 


1573 


1592 


SEQ ID NO 


: 161 


ccagaactcaagtcttcaa 


1620 


1639 


SEQ ID NO 


: 162 


agtcttcaatcctgaaatg 


1630 


1649 


SEQ ID NO 


: 163 


tgagcaagtgaagaacttt 


1868 


1887 


SEQ ID NO 


: 164 


agcaagtgaagaactttgt 


1870 


1889 


SEQ ID NO 


: 165 


tctgaaagaatctcaactt 


1964 


1983 


SEQ ID NO 


: 166 


actgtcatggacttcagaa 


1986 


2005 


SEQ ID NO:


167 


acttgacccagcctcagcc 


2051 


2070 


SEQ ID NO: 


168 


tccaaataactaccttcct 


2096 


2115 


SEQ ID NO: 


169 


actaccctcactgcctttg 


2133 


2152 


SEQ ID NO: 


170 


ttggatttgcttcagctga 


2149 


2168 


SEQ ID NO: 


171 


ttggaagctctttttggga 


2211 


2230 


SEQ ID NO: 


172 


ggaagctctttttgggaag 


2213 


2232 


SEQ ID NO: 


173 


tttttcccagacagtgtca 


2238 


2257 


SEQ ID NO: 


174 


agacagtgtcaacaaagct 


2246 


2265 


SEQ ID NO: 


175 


ctttggctataccaaagat 


2321 


2340 


SEQ ID NO: 


176 


caaagatgataaacatgag 


2333 


2352 


SEQ ID NO: 


177 


gatatggtaaatggaataa 


2355 


2374 


SEQ ID NO: 


178 


ggaataatgctcagtgttg 


2367 


2386 


SEQ ID NO: 


179 


tttgaaatccaaagaagtc 


2402 


2421 


SEQ ID NO: 


180 


gatcccccagatgattgga 


2534 


2553 


SEQ ID NO: 


181 


cagatgattggagaggtca 


2541 


2560 


SEQ ID NO: 


182 


agaatgacttttttcttca 


2575 


2594 


SEQ ID NO: 


183 


gaactccccactggagctg 


2619 


2638 


SEQ ID NO: 


184 


atatcttcatctggagtca 


2652 


2671 


SEQ ID NO: 


185 


gtcattgctcccggagcca


2667 


2686 



SEQ ID NO 


: 1151 


ttcaattcttcaatgctgt 


10500 


10519


1 1 


5 


SEQ ID NO 


: 1152 


agatttgaggattccatca 


7976 


7995 


1 


5 


SEQ ID NO 


: 1153 


caaggagaaactgactgct 


6524 


6543 


1 


5 


SEQ ID NO 


: 1154 


tgtagtctcctggtgctgg 


5094 


5113 


1 


5 


SEQ ID NO 


: 1155 


ctggagcttagtaatggag 


8709 


8728 


1 


5 


SEQ ID NO 


: 1156 


tcagatgagggaacacatg


8919 


8938


1 


5 


SEQ ID NO 


: 1157


gccaccctggaactctctc 


10869 


10888 


1 


5 


SEQ ID NO 


: 1158


tgatcccacctctcattga


2965 


2984 


1 


5 


SEQ ID NO: 


1159


ctcagggatctgaaggtgg 


8187 


8206 


1 


5 


SEQ ID NO:


1160 


tcttgagttaaatgctgac 


4979 


4998 


1 


5 


SEQ ID NO: 


1161 


aggtatattcgaaagtcca 


12799 


12818 


1 


5 


SEQ ID NO: 


1162 


caggtatattcgaaagtcc 


12798 


12817 1 


5 


SEQ ID NO: 


1163 


tttcacatgccaaggagaa 


6514 


6533 


1 


5 


SEQ ID NO: 


1164


ttgaagtgtagtctcctgg 


5088 


5107 


1 


5 


SEQ ID NO: 


1165 


catttctgattggtggact 


7757 


7776 


1 


5 


SEQ ID NO: 


1166 


aaagtgccacttttactca 


6183 


6202 


1 


5 


SEQ ID NO: 


1167 


acaaagtcagtgccctgct 


6007 


6026 


1 


5 


SEQ ID NO: 


1168 


aagtccataatggttcaga 


12811 


12830 


1 


5 


SEQ ID NO: 


1169 


ttctgaatatattgtcagt 


13376 


13395 


1 


5 


SEQ ID NO: 


1170 


ggctcaccctgagagaagt 


12391 


12410 


1 


5 


SEQ ID NO: 


1171 


aggaagatatgaagatgga 


4712 


4731 


1 


5 


SEQ ID NO: 


1172 


caaatttgtggagggtagt 


10319 


10338 


1 


5 


SEQ ID NO: 


1173 


tcagtataagtacaaccaa 


9392 


9411 


1 


5 


SEQ ID NO: 


1174 


tcccgattcacgcttccaa 


11577 


11596 


1 


5 


SEQ ID NO: 


1175 


cttcagaaagctaccttcc 


7929 


7948 


1 


5 


SEQ ID NO: 


1176 


tgaccttctctaagcaaaa 


4876 


4895 


1 


5 


SEQ ID NO: 


1177 


agcttggttttgccagtct 


2458 


2477 


1 


5 


SEQ ID NO: 


1178 


atctcgtgtctaggaaaag 


5968 


5987 


1 


5 


SEQ ID NO: 


1179 


ctcaaggataacgtgtttg 


12609 


12628 


1 


5 


SEQ ID NO: 


1180 


ttatcttattaattatatc 


13079 


13098 


1 


5 


SEQ ID NO: 


1181 


caacacttacttgaattcc 


10661 


10680 1 


5 


SEQ ID NO: 


1182 


gacttcagagaaatacaaa 


11400 


11419 


1 


5 


SEQ ID NO: 


1183 


tccaatttccctgtggatc 


3681 


3700 


1 


5 


SEQ ID NO: 


1184 


tgaccacacaaacagtctg 


5363 


5382 


1 


5 


SEQ ID NO: 


1185 


tgaagtccggattcattct 


11015 


11034 


1 


5 


SEQ ID NO: 


1186 


cagctcaaccgtacagttc 


11861 


11880 1 


5 


SEQ ID NO: 


1187 


tgacttcagtgcagaatat 


11966 


11985 1 


5 


SEQ ID NO: 


1188 


tggccccgtttaccatgac 


5809 


5828 


1 


5 
SEQ ID NO:


186 


actaaaatttatcattcct 


2873 




SEQ ID NO:


1189


«yycayy L^LLLadyiiCay u 






c: 
o 


SEQ ID NO:


187 


sttccHccccaaaaaGac 


2886 


2905 


SEQ ID NO:


1190


y lOlO I l^l^ LOt^r ca ly y CI a L 


1 nATn 


in/IRQ A 

1 u^oy 1 


c; 


SEQ ID NO:


188 


ctcattaaaaacacicicacit 


2976 


2995 


SEQ ID NO:


1191


aoiy ct li/iy <k.»cioy U I L ly ay 


1 •\ T'^ft 
1 1 r OO 


1 1 / / O 1 


O 


SEQ ID NO:


189 


ttoaacaotattctotcaa 


3142 


3161 


SEQ ID NO:


1192


L^iyctyotyciay ty luuood 


1 OQQQ 


I Z'f 1 O 1 


O 


SEQ ID NO:


190 


accttatccaotciaantRf^ 


3285 


3304 


SEQ ID NO:


1193


yyaoy y LdULyicccay yi 




lidoUO 1 


o 


SEQ ID NO:


191 


nnsntopiantcnpisattpp 






SEQ ID NO:


1194

yyciciyyL>ciy<dguiaciyy 


y I^O 


O i CSV '\ 


D 


SEQ ID NO:


192 




3394 




SEQ ID NO:


1195


aLLLOULcJaciyii^Ly yaiy I 


111 A7 
1 1 \0( 


^11 RR A 
1 1 1 OD 1 


D 


SEQ ID NO:


193 


aaaaaatcaaaaatottat 


3463 


3482


SEQ ID NO:


1196


ata a a r^tn r'a a n QtftHr* 
ea Laa caLu Ly Oddy d L L L L IL? 




i OO 1 y 1 


c 
O 


SEQ ID NO:


194 


aaatcaaaoatattatttc 


3466 


3485 


SEQ ID NO:


1197


y ca d d Lr d d ly a L Ldy a LI L 




Q7RA 1 


O 


SEQ ID NO: 


195 


tQQcattataataaaciaaa 


3609 


3628 


SEQ ID NO:


1198


f ptPTTTi tn t a f a a I'n a 


1 1 7R1 
1 1 r O 1 


1 1 Rnn 1 


c 
o 


SEQ ID NO:


196 


aaaaaaaaattaaatttaa 


3622 


3641 


SEQ ID NO:


1199


tpaaaariT'ta r'tn l'r> t r«f+ 
iVi^cdddww Ldwiy i,OL I 




1 02177 1 


o 


SEQ ID NO: 


197 


oaatoacttccaatticcc 


3673 



3692 


SEQ ID NO:


1200

y y y a d d d L iLU d L L L 


701 3 


70R9 1 


c 
o 


SEQ ID NO: 


198 


ata acttccaatttcccta 


3675 


3694 


SEQ ID NO:


1201 


caaantaattamantpat' 

w«yy VLy dLkQViry d^ LOdL 


4Q17 


4Q3R 1 




SEQ ID NO: 


199 


acttccaatttcccta ta a 


3678 


3697 


SEQ ID NO:


1202 


ppapnaaaaatatnnaarit 
uod^y ddddd Ld Ly y ddy i 


1 O'^RO 




c 
o 


SEQ ID NO:


200 


aattacaata aa ctcataa 


3803 


3822 


SEQ ID NO:


1203 


ppatpanttpanaf aaapt 




ROnR 1 


c 
o 


SEQ ID NO: 


201 


tttacaaaaccacctcaat 


3860 



3879 


SEQ ID NO:


1204


aftnapptotppattoaaa 
diiydULriy lu^dllUddd 


1 ^^^71 
1 OO / 1 




o 


SEQ ID NO:


202 


a aan a aa ttca acctccaa 


3884 


3903 


SEQ ID NO:


1205

ptnnaattntpattppttp 
wiyy ddiiy iLrdLLLrULLLr 


1 1 79ft 


1 1 7il7 1 


c 

o 


SEQ ID NO: 


203 


acttccacatcccaoaaaa 


3919 


3938 


SEQ ID NO:


1206 


ttttaapaaaanfnnaanf 
LLLLddOddddy i^y ddy i 




AR40 1 


c 
o 


SEQ ID NO: 


204 


ctcttctta aaa aa ca ata 


3939 


3958 


SEQ ID NO:


1207


patpaptnppaaannan ao 
^diwdoiy owdddy y dy dy 


OH-OO 


fl'^n'^ 1 

OOUO I 


c 
o 


SEQ ID NO:


205 


aaaaacaatGGCcaaatca 


3948 


3967 


SEQ ID NO:


1208 


ly dOLOdL'iW'diiy dllLl 


19fift0 


19RQQ 1 


o 


SEQ ID NO:


206 


ttccttta ccttttd a taa 


4003 



4022 


SEQ ID NO:


1209


r»/^Qr^QQapa af n csa/^ a o 
v^odOdddOddiy ddy yy dd 


Q9^R 


097*^ 1 


c 

o 


SEQ ID NO:


207 


caaatctataaaattccat 


4079 


4098 


SEQ ID NO:


1210


diyy yddddddurdyycuy 






O 


SEQ ID NO:


208 


aaatccctacttttaccat 


4117 


4136 


SEQ ID NO:


1211


atfinnaon+ataanaai^tt 

diyy yddy ididdyddoii 




HOOO 1 


O 


SEQ ID NO: 


209 


tacctctccta aata ttct 


4159 


4178 


SEQ ID NO:


1212 


anaaaaapaaapapanripa 
dydddddOdddOdOdyy wd 


Qfizl3 


QRfi9 1 


O 


SEQ ID NO: 


210 


accaacacaaaccatttca 


4242 


4261 


SEQ ID NO:


1213


tnaanfTitantpf pptnnf" 
i.yddy ly Ldy luiuuiyy L 


\J\JOXj 


'SI OR 1 


c 
O 


SEQ ID NO:


211 


ccaa cacao accatttcaa 


4243 


4262 


SEQ ID NO:


1214 


ptnaaatapaa+nptptnn 
vLydddidOdd ly^/io lyy 


sJ\J 1 1 




o 


SEQ ID NO: 


212 


actatcata to ata a a tct 


4367 


4386 


SEQ ID NO:


1215 


anapapptnattttatanf" 
dydOdvviydLiiLd Ldy L 




7Qfi7 1 

f OO f 1 


o 


SEQ ID NO: 


213 


accacaaatatctacttca 


4496 


4515 


SEQ ID NO:


1216 


taaannptfiaptptntnnt 
LyddyyuiydULv^iy Lyy i 


49 R9 


AROI 1 


c 

o 


SEQ ID NO: 


214 


ccacaaatotctocttcao 


4497 


4516 


SEQ ID NO:


1217 


pta ^^ojPi^f^^ a a ttf ntn n 

wLy dy wddwdddlLLy Lyy 


1031 1 


10330 1 


c 

o 


SEQ ID NO: 


215 


tttaaactccaaaaaoaaa 


4520 


4539 


SEQ ID NO:


1218 


tttptptpatriattapaaa 

LlLwL^lV^CIlUdllClV/ddd 




Un70^ 1 


o 


SEQ ID NO: 


216 


tcaaaoaaatcaaoattaa 


4552 


4571 


SEQ ID NO:


1219 


Lv^ddy y d idd^y ly l l ly d 


i9fiin 


19fi9Q 1 


c 

o 


SEQ ID NO:


217


a a y oaulaCy a y C ly SC 




I / 


SEQ ID NO:


1220

gtcagatattgttgctcat


10187


10206 1 


5 


SEQ ID NO: 


218 


ttaaaatctgacaccaatg 


4818 


4837 


SEQ ID NO: 


1221 


cattcattgaagatgttaa 


7342 


7361 1 


5 


SEQ ID NO: 


219 


gaagtataagaactttgcc 


4838 


4857 


SEQ ID NO: 


1222 


ggcaaatttgaaggacttc 


11994 


12013 1 


5 


SEQ ID NO: 


220 


aagtataagaactttgcca 


4839 


4858 


SEQ ID NO: 


1223 


tggcaaatttgaaggactt 


11993 


12012 1 


5 


SEQ ID NO: 


221 


ttcttcagcctgctttctg 


4941 


4960 


SEQ ID NO: 


1224 


cagaatccagatacaagaa 


6884 


6903 1 


5 


SEQ ID NO: 


222 


ctggatcactaaattccca 


4957 


4976 


SEQ ID NO: 


1225


tgggtctttccagagccag 


11033 


11052 1 


5 


SEQ ID NO: 


223 


aaattaatagtggtgctca 


5014 


5033 


SEQ ID NO: 


1226 


tgagaagccccaagaattt


6248 


6267 1 


5 
SEQ ID NO: 224


agtgcaacgaccaacttga


5073 


5092 


SEQ ID NO: 1227 


tcaaattcctggatacact 


9848 


9867 


1 


5 


SEQ ID NO: 225 


ctgggaagtgcttatcagg


5238 


5257 


SEQ ID NO: 1228 


cctgaccttcacataccag 


8310 


8329 


1 


5 


SEQ ID NO: 226


gcaaaaacattttcaactt 


5278 


5297 


SEQ ID NO: 1229 


aagtaaaagaaaattttgc 


10744 


10763 


1 


5 


SEQ ID NO: 227 


aaaaacattttcaacttca 


5280 


5299 


SEQ ID NO: 1230 


tgaagtaaaagaaaatttt 


10742 


10761 


1 


5 


SEQ ID NO: 228 


tcagtcaagaaggacttaa


5302 


5321 


SEQ ID NO: 1231 


ttaaggacttccattctga


13363 


13382 


1 


5 


SEQ ID NO: 229 


tcaaatgacatgatgggct


5325 


5344 


SEQ ID NO: 1232 


agcccatcaatatcattga 


6205 


6224 


1 


5 


SEQ ID NO: 230 


cacacaaacagtctgaaca


5367 


5386 


SEQ ID NO: 1233 


tgtttcaactgcctttgtg 


11219 


11238 


1 


5 


SEQ ID NO: 231 


tcttcaaaacttgacaaca 


5409 


5428 


SEQ ID NO: 1234 


tgttttcctatttccaaga 


12835 


12854 


1 


5 


SEQ ID NO: 232 


caagttttataagcaaact 


5441 


5460 


SEQ ID NO: 1235 


agttattttgctaaacttg 


14043 


14062 


1 


5 


SEQ ID NO: 233 


tggtaactactttaaacag 


5488 


5507 


SEQ ID NO: 1236 


ctgtttttagaggaaacca 


7512 


7531 


1 


5 


SEQ ID NO: 234 


aacagtgacctgaaataca 


5502 


5521 


SEQ ID NO: 1237 


tgtatagcaaattcctgtt 


5890 


5909 


1 


5 


SEQ ID NO: 235 


gggaaactacggctagaac


5544 


5563 


SEQ ID NO: 1238 


gttccttccatgatttccc 


10933 


10952 


1 


5 


SEQ ID NO: 236 


aacacatctatgccatctc 


5620 


5639 


SEQ ID NO: 1239 


gagacagcatcttcgtgtt 


11204 


11223 


1 


5 


SEQ ID NO: 237 


tcagcaagctataaagcag 


5652 


5671 


SEQ ID NO: 1240 


ctgctaagaaccttactga 


7780 


7799 


1 


5 


SEQ ID NO: 238 


gcagacactgttgctaagg


5667 


5686 


SEQ ID NO: 1241 


cctttcaagcactgactgc 


11746 


11765 


1 


5 


SEQ ID NO: 239 


tctggggagaacatactgg


5866 


5885 


SEQ ID NO: 1242 


ccaggttttccacaccaga 


8038 


8057 


1 


5 


SEQ ID NO: 240 


ttctctcatgattacaaag


5934


5953 


SEQ ID NO: 1243 


ctttttcaccaacggagaa 


10838 


10857 


1 


5 


SEQ ID NO: 241 


ctgagcagacaggcacctg


6034 


6053 


SEQ ID NO: 1244 


caggaggctttaagttcag 


7599 


7618 


1 


5 


SEQ ID NO: 242 


caatttaacaacaatgaat 


6066 


6085 


SEQ ID NO: 1245 


attccttcctttacaattg 


8082 


8101 


1 


5 


SEQ ID NO: 243 


tggacgaactctggctgac 


6140 


6159 


SEQ ID NO: 1246 


gtcagcccagttccttcca 


10924 


10943 


1 


5 


SEQ ID NO: 244 


cttttactcagtgagccca


6192 


6211 


SEQ ID NO: 1247 


tgggctaaacgtatgaaag 


7827 


7846 


1 


5 


SEQ ID NO: 245 


tcattgatgctttagagat 


6217 


6236 


SEQ ID NO: 1248 


atcttcataagttcaatga 


13174 


13193 


1 


5 


SEQ ID NO: 246 


aaaaccaagatgttcactc 


6295 


6314 


SEQ ID NO: 1249 


gagtgaaatgctgtttttt 


8630 


8649 


1 


5 


SEQ ID NO: 247 


aggaatcgacaaaccatta 


6357 


6376 


SEQ ID NO: 1250 


taatgattttcaagttcct


8294 


8313 


1 


5 


SEQ ID NO: 248 


tagttgtactggaaaacgt 


6376 


6395 


SEQ ID NO: 1251 


acgttagcctctaagacta 


11928 


11947 


1 


5 


SEQ ID NO: 249 


ggaaaacgtacagagaaag 


6386 


6405 


SEQ ID NO: 1252 


cttttacaattcattttcc 


13014 


13033 1 


5 


SEQ ID NO: 250 


gaaaacgtacagagaaagc 6387 


6406 


SEQ ID NO: 1253 


gctttctcttccacatttc


10052 


10071 


1 


5 


SEQ ID NO: 251 


aaagctgaagcacatcaat 


6401 


6420 


SEQ ID NO: 1254 


attgatgttagagtgcttt


6984 


7003 


1 


5 


SEQ ID NO: 252 


aagctgaagcacatcaata 


6402 


6421 


SEQ ID NO: 1255 


tattgatgttagagtgctt 


6983 


7002 


1 


5 


SEQ ID NO: 253


tgaagcacatcaatattga 


6406 


6425 


SEQ ID NO: 1256 


tcaaccttaatgattttca 


8287 


8306 


1 


5 


SEQ ID NO: 254 


atcaatattgatcaatttg 


6414 


6433 


SEQ ID NO: 1257 


caaagccatcactgatgat 


1660 


1679 


1 


5 


SEQ ID NO: 255


taatgattatctgaattca 


6476 


6495 


SEQ ID NO: 1258


tgaaatcattaaaaaatta


6719 


6738 


1 


5 


SEQ ID NO: 256 


gattatctgaattcattca 


6480 


6499 


SEQ ID NO: 1259 


tgaagtagctgagaaaatc 


7094 


7113 


1 


5 


SEQ ID NO: 257 


aattgggagagacaagttt 


6498 


6517 


SEQ ID NO: 1260 


aaacattcctttaacaatt 


9488 


9507 


1 


5 


SEQ ID NO: 258 


aaaatagctattgctaata 


6693 


6712 


SEQ ID NO: 1261 


tattgaaaatattgatttt 


6806 


6825 


1 


5 


SEQ ID NO: 259 


aaaattaaaaagtcttgat 


6731 


6750 


SEQ ID NO: 1262 


atcatatccgtgtaatttt 


6757 


6776 


1 


5 


SEQ ID NO: 260 


ttgaaaatattgattttaa


6808 


6827 


SEQ ID NO: 1263 


ttaatcttcataagttcaa


13171 


13190 


1 


5 


SEQ ID NO: 261 


agacatccagcacctagct 


6938 


6957 


SEQ ID NO: 1264 


agcttggttttgccagtct 


2458 


2477 


1 


5 
SEQ ID NO: 262


caatttcattta a aaa a at 


7021 



7040 


SEQ ID NO: 263


aaattttaatqqataaatt 


7174 


7193 


SEQ ID NO: 264


caaaaactaaacaatatcc 


7233 


7252 


SEQ ID NO: 265


taaa ataaaaaattacttt 


7262 


7281 


SEQ ID NO: 266


aaaaattactttciiaaaaat 


7269 


7288 


SEQ ID NO: 267


aaaaaattaattoaattta 


7281 


7300 


SEQ ID NO: 268


atttattqatqatqctqtc 


7295 


7314 


SEQ ID NO: 269


a aattatcttttaaa acat 


7326 


7345 


SEQ ID NO: 270


ttaccaccaotttqtaaat 


7403 


7422 


SEQ ID NO: 271


ttacaatatatctaaaaao 


7540 


7559 


SEQ ID NO: 272


cattcagcaggaacttcaa


7691 


7710 


SEQ ID NO: 273


acaccta attttataatcc 


7950 


7969 


SEQ ID NO: 274


aaattccatcaattcaaat 


7984 


8003 


SEQ ID NO: 275


ttataaaaataaaaataaa 


8104 


8123 


SEQ ID NO: 276


rttaaacaataaQctacaot 


8148 


8167 


SEQ ID NO: 277


aatccaatctcctcttttc 


8399 


8418 



SEQ ID NO: 278


attttaattttcaaacaaa 


8524 


8543 


SEQ ID NO: 279


tttta attttcaaa caaat 



8525 


8544 


SEQ ID NO: 280


taattttcaaocaaataca 


8528 


8547 


SEQ ID NO: 281


ata etc tttttto q aa at q 



8637 


8656 


SEQ ID NO: 282


to eta tttttta a aaatq c 


8638 


8657 


SEQ ID NO: 283


aaaaaaatacactqqaqct 


8698 


8717 


SEQ ID NO: 284


acta aaa ctt aq taatq q a 



8708 


8727 


SEQ ID NO: 285


cttctaaaaaaoQatcatq 



8878 


8897 


SEQ ID NO: 286


q a aaaaqqqtcatqq aaat 


8883 


8902 


SEQ ID NO: 287


aaacctaccccaaattctc 


8902 


8921 


SEQ ID NO: 288


ttctcagatgagggaacac


8916 


8935 


SEQ ID NO: 289


gatgagggaacacatgaat


8922 


8941 


SEQ ID NO: 290 


ctttggactgtccaataag 


8978 


8997 


SEQ ID NO: 291


acatccacaaacaatqaaq 


9252 


9271 


SEQ ID NO: 292


cacaaacaatqaaqqqaat 



9257 


9276 




ccaa aainciCTy cig ga 






SEQ ID NO: 294 


caaaatttctctgctggaa


9408 


9427 


SEQ ID NO: 295 


tctgctggaaacaacgaga 


9417 


9436 


SEQ ID NO: 296 


ctgctggaaacaacgagaa


9418 


9437 


SEQ ID NO: 297 


agaacattatggaggccca 


9433 


9452 


SEQ ID NO: 298 


agaagcaaatctggatttc 


9467 


9486 


SEQ ID NO: 299


tttctctctatgggaaaaa 


9557 


9576 



SEQ ID NO: 1265 


attccttcctttacaattg 



8082 


8101 


1 


5 


SEQ ID NO: 1266 


aattgttgaaagaaaacct 


13147 


13166 1 


5 


SEQ ID NO: 1267 


ggacaaggcccagaatctg 



12545 


12564 


1 


5 


SEQ ID NO: 1268 


aaagaaaacctatgcctta


13155 


13174 


1 


5 


SEQ ID NO: 1269 


atttcttaaacattccttt 


9481 


9500 


1 


5 


SEQ ID NO: 1270 


taaagccattcagtctctc


12962


12981 


1 


5 


SEQ ID NO: 1271 


gacatgttgataaagaaat 


7371 


7390 


1 


5 


SEQ ID NO: 1272 


atgtatcaaatggacattc


7677 


7696 


1 


5 


SEQ ID NO: 1273 


atctggaaccttgaagtaa 


10731 


10750 


1 


5 


SEQ ID NO: 1274 


cttttcacattagatgcaa 


8412 


8431 


1 


5 


SEQ ID NO: 1275 


ttgaaggacttcaggaatg 


12001 


12020 


1 


5 


SEQ ID NO: 1276 


ggactcaaggataacgtgt 


12606 


12625 


1 


5 


SEQ ID NO: 1277 


atcttcaatgattatatcc 


13116 


13135 


1 


5 


SEQ ID NO: 1278 


tttatgattatgtcaacaa 


12352 


12371 


1 


5 


SEQ ID NO: 1279 


actggacttctctagtcag 


8801 


8820 


1 


5 


SEQ ID NO: 1280 


gaaaaatgaagtccggatt



11009 


11028 


1 


5 


SEQ ID NO: 1281 


tttgcaagttaaagaaaat


14015 


14034 


1 


6 


SEQ ID NO: 1282 


atttgatttaagtgtaaaa 


9614 


9633 


1 


5 


SEQ ID NO: 1283 


tgcaagttaaagaaaatca 


14017 


14036 


1 


5 


SEQ ID NO: 1284 


cattggtaggagacagcat 


11195 


11214 


1 


5 


SEQ ID NO: 1285 


gcattggtaggagacagca 


11194 


11213 


1 


5 


SEQ ID NO: 1286 


agctagagggcctcttttt


10825 


10844 


1 


5 


SEQ ID NO: 1287 


tccactcacatcctccagt 


1281 


1300 


1 


5 


SEQ ID NO: 1288 


catgaacccctacatgaag


13751 


13770 


1 


5 


SEQ ID NO: 1289 


atttgaaagttcgttttcc 


9274 


9293 


1 


5 


SEQ ID NO: 1290 


gagaacattatggaggccc 


9432 


9451 


1 


5 


SEQ ID NO: 1291 


gtgtcttcaaagctgagaa 


12408 


12427 


1 


5 


SEQ ID NO: 1292 


attccagcttccccacatc 


8330 


8349 


1 


5 


SEQ ID NO: 1293 


cttatgggatttcctaaag 


11159 


11178 


1 


5 


SEQ ID NO: 1294 


cttcatctgtcattgatgc 


10219 


10238 


1 


5 


SEQ ID NO: 1295 


attccctgaagttgatgtg 


11480 


11499 


1 


5 


SEQ ID NO: 1296


tccatcacaaatcctttgg


9663 


9682 


1 


5 


SEQ ID NO: 1297 


ttccatcacaaatcctttg 


9662 


9681 


1 


5 


SEQ ID NO: 1298 


tctcaagagttacagcaga


13221 


13240 


1 


5 


SEQ ID NO: 1299 


ttctcaagagttacagcag 


13220 


13239 


1 


5 


SEQ ID NO: 1300 


tgggcctgccccagattct 


8901 


8920 


1 


5 


SEQ ID NO: 1301 


gaaatcttcaatttattct 


13813 


13832 


1 


5 


SEQ ID NO: 1302 


tttttgcaagttaaagaaa 


14013 


14032 1 


5 






SEQ ID NO: 


300 


tcagagcatcaaatccttt 


9704 


9723 


SEQ ID NO: 


1303 


aaaaaaaatcaaaatctaa 


14025 


14044 



1 

1 


5 


SEQ ID NO: 


301 


cagaaacaatgcattagat 


9743 


9762 


SEQ ID NO: 


1304 


atctatoccatctcttcta 


5625 


5644 


1 

1 


5 


SEQ ID NO: 


302 


tacacattaatcctgccat 


9993 


10012 


SEQ ID NO: 


1305 


atggagtctttattgtgta


14081 


14100 


1 


5 


SEQ ID NO: 


303 


agtcagatattgttgctca 


10186 


10205 


SEQ ID NO: 


1306 


tgagaactacaaactaact


4799 


4818 


1 


5 


SEQ ID NO: 


304 


ggagggtagtcataacagt 


10328 


10347 


SEQ ID NO: 


1307 


actggtggcaaaaccctcc


2726 


2745 


1 


5 


SEQ ID NO: 


305 


caaaagccgaaattccaat 


10396 


10415 


SEQ ID NO:


1308


attgaagtacctactttta


8358 


8377 


1 


5 


SEQ ID NO: 


306 


aaaagccgaaattccaatt 


10397 


10416 


SEQ ID NO: 


1309 


aattgaagtacctactttt 


8357 


8376 


1 


5 


SEQ ID NO: 


307 


ttcaagcaagaacttaatg 


10428 


10447 


SEQ ID NO: 


1310 


cattatagcccttcgtaaa


13250 


13269 


1 


5 


SEQ ID NO: 


308 


cctcttacttttccattga 


10570 


10589 


SEQ ID NO: 


1311 


tcaaaagaagcccaagagg


12939 


12958 


1 


5 


SEQ ID NO: 


309 


tgaggccaacacttacttg 


10655 


10674 


SEQ ID NO: 


1312 


caagcatctgattgactca 


12668 


12687 1 


5 


SEQ ID NO: 


310 


cacttacttgaattccaag


10664 


10683 


SEQ ID NO: 


1313 


cttgaacacaaagtcagtg 


6000 


6019 


1 


5 


SEQ ID NO: 


311 


gaagtaaaagaaaattttg 


10743 


10762 


SEQ ID NO: 


1314 


caaaaacattttcaacttc 


5279 


5298 


1 


5 


SEQ ID NO: 


312 


cctggaactctctccatgg 


10874 


10893 


SEQ ID NO: 


1315 


ccatttacagatcttcagg


11364 


11383 


1 


5 


SEQ ID NO: 


313 


agctggatgtaaccaccag


11176 


11195 


SEQ ID NO: 


1316 


eta a attccacata ca a ct 


11847


11865 


1 


5 


SEQ ID NO: 


314 


aaaattccctgaagttgat 


11477 


11496 


SEQ ID NO: 


1317 


atcatatccgtgtaatttt


6757 


6776 


1 


5 


SEQ ID NO: 


315 


cagatggcattgctgcttt 


11605 


11624 


SEQ ID NO: 


1318 


aaagctgagaagaaatcta


12416 


12435 


1 


5 


SEQ ID NO: 


316 


agatggcattgctgctttg 


11606 


11625 


SEQ ID NO: 


1319 


caaagctgagaagaaatct


12415 


12434 


1 


5 


SEQ ID NO: 


317 


tgttgaaacagtcctggat 


11834 


11853 


SEQ ID NO: 


1320 


atccaagatgagatcaaca


13095 


13114 


1 


5 


SEQ ID NO: 


318 


catattcaaaactgagttg 


12221 


12240 


SEQ ID NO: 


1321 


caactctctaattactata 


13623 


13642 


1 


5 


SEQ ID NO: 


319 


aaagatttatcaaaagaag 


12930 


12949 


SEQ ID NO: 


1322 


cttcaatttattcttcttt 


13818 


13837 


1 


5 


SEQ ID NO: 


320 


attttccaactaatagaag 


13026 


13045 


SEQ ID NO: 


1323 


cttcaaagacttaaaaaat


8006 


8025 


1 


5 


SEQ ID NO: 


321 


aattatatccaagatgaga


13089 


13108 


SEQ ID NO: 


1324 


tctcttcctccatq a aatt 


10471 


10490 


1 


5 


SEQ ID NO: 


322 


ttcaggaagcttctcaaga 


13210 


13229 


SEQ ID NO: 


1325 


tcttcataagttcaatgaa


13175 


13194 


1 


5 


SEQ ID NO: 


323 


ttgagcaatttctgcacag 


13429 


13448 


SEQ ID NO: 


1326 


ctgttgaaagatttatcaa


12924 


12943 


1 


5 


SEQ ID NO: 


324 


ctgatatacatcacggagt


13704 


13723 


SEQ ID NO: 


1327 


actcaatggtgaaattcag


7457 


7476 


1 


5 


SEQ ID NO: 


325 


acatcacggagttactgaa 


13711 


13730 


SEQ ID NO: 


1328 


ttcagaagctaagcaatgt 


7231 


7250 


1 


5 


SEQ ID NO: 


326 


actgcctatattgataaaa 


13874 


13893 


SEQ ID NO: 


1329 


ttttggcaagctatacagt 


8372 


8391 


1 


5 


SEQ ID NO: 


327 


aggatggcattttttgcaa 


14003 


14022 


SEQ ID NO: 


1330 


ttgcaagcaagtctttcct 


3005 


3024 


1 


5 


SEQ ID NO: 


328 


ttttttgcaagttaaagaa 


14012 


14031 


SEQ ID NO: 


1331 


ttctctctatgggaaaaaa 


9558 


9577 


1 


5 


SEQ ID NO: 


329 


tccagaactcaagtcttca 


1619 


1638 


SEQ ID NO: 


1332 


tgaaatgctgttttttgga


8633 


8652 


3 


4 


SEQ ID NO: 


330 


agttagtgaaagaagttct 


1948 


1967 


SEQ ID NO: 


1333 


agaatctgtaccaggaact 


12556 


12575 3 


4 


SEQ ID NO: 


331 


atttacaactctaacaaat 


5427 


5446 


SEQ ID NO:


1334 


O wLLv/ClM ClUCIClCILClwClClCl I 


11401


11420 3 




SEQ ID NO: 


332 


gattatctgaattcattca 


6480 


6499 


SEQ ID NO: 


1335 


tgaaaccaatgacaaaatc 


7421 


7440 


3 


4 


SEQ ID NO: 


333 


gtgcccttctcggttgctg 


18 


37 


SEQ ID NO: 


1336 


cagctgagcagacaggcac 


6031 


6050 


2 


4 


SEQ ID NO: 


334 


attcaagcacctccggaag 


245 


264 


SEQ ID NO: 


1337 


cttcataagttcaatgaat 


13176 


13195 2 


4 


SEQ ID NO: 


335 


gactgctgattcaagaagt 


308 


327 


SEQ ID NO: 


1338 


acttcccaactctcaagtc 


13407 


13426 2 


4 


SEQ ID NO: 


336 


ttgctgcagccatgtccag 


475 


494 


SEQ ID NO: 


1339 


ctgggcagctgtatagcaa 


5881 


5900 


2 


4 


SEQ ID NO: 


337 


agaaagatgaacctactta 


547 


566 


SEQ ID NO: 


1340 


taagtatgatttcaattct


10490 


10509 2 


4


SEQ ID NO:


338 


tgaagactctccaggaact 


1087 


1106 


SEQ ID NO: 1341 


agttcaatgaatttattca 


13183 


13202 2 


4 


SEQ ID NO: 


339 


atctctcttgccacagctg


1202 


1221 


SEQ ID NO: 1342 


cagcccagccatttgagat 


9229 


9248 2 


4 


SEQ ID NO: 


340 


tctctcttgccacagctga 


1203 


1222 


SEQ ID NO: 1343 


tcagcccagccatttgaga 


9228 


9247 2 


4 


SEQ ID NO: 


341 


tgaggtatccagccccatc


1223 


1242 


SEQ ID NO: 1344 


gatgggaaagccgccctca


5208 


5227 2 


4 


SEQ ID NO: 


342 


ccagaactcaagtcttcaa 


1620 


1639 


SEQ ID NO: 1345 


ttgaaagcagaacctctgg 


5907 


5926 2 


4 


SEQ ID NO: 


343 


ctgaaaaagttagtgaaag


1941 


1960 


SEQ ID NO: 1346 


ctttctcgggaatattcag 


10623 


10642 2 


4 


SEQ ID NO: 


344 


tttttcccagacagtgtca


2238


2257 


SEQ ID NO: 1347 


tgacaggcattttgaaaaa


9722 


9741 2 


4 


SEQ ID NO: 


345 


ttttcccagacagtgtcaa 


2239 


2258 


SEQ ID NO: 1348 


ttgacaggcattttgaaaa 


9721 


9740 2 


4 


SEQ ID NO: 


346 


cattcagaacaagaaaatt 


3395 


3414 


SEQ ID NO: 1349 


aattccaattttgagaatg


10406 


10425 2 


4 


SEQ ID NO: 


347 


tgaagagaagattgaattt 


3620 


3639 


SEQ ID NO: 1350 


aaatgtcagctcttgttca 


10894 


10913 2 


4 


SEQ ID NO: 


348 


tttgaatggaacacaggca 


3636 


3655 


SEQ ID NO: 1351 


tgccagtttgaaaaacaaa 


11807


11826 2 


4 


SEQ ID NO: 


349 


ttctagattcgaatatcaa 


4399 


4418 


SEQ ID NO: 1352 


ttgacatgttgataaagaa 


7369 


7388 2 


4 


SEQ ID NO: 


350 


gattcgaatatcaaattca 


4404 


4423 


SEQ ID NO: 1353 


tgaagtagaccaacaaatc 


7154 


7173 2 


4 


SEQ ID NO: 


351 


tgcaacgaccaacttgaag 


5075 


5094 


SEQ ID NO: 1354 


cttcaggttccatcgtgca


11376 


11395 2 


4 


SEQ ID NO: 352 


ttaagctctcaaatgacat


5317 


5336 


SEQ ID NO: 1355 


atgttgataaagaaattaa 


7374 


7393 2 


4 


SEQ ID NO: 


353 


caatttaacaacaatgaat 


6066 


6085 


SEQ ID NO: 1356 


attcaaactgcctatattg 


13868 


13887 2 


4 


SEQ ID NO: 354 


tgaatacagccaggacttg 


6080 


6099 


SEQ ID NO: 1357 


caagagcacacggtcttca 


10679 


10698 2 


4 


SEQ ID NO: 355 


catcaatattgatcaattt 


6413 


6432 


SEQ ID NO: 1358


aaattccctgaagttgatg


11478 


11497 2 


4 


SEQ ID NO: 


356 


ttgagcatgtcaaacactt 


7051 


7070 


SEQ ID NO: 1359 


aagtaagtgctaggttcaa 


9373 


9392 2 


4 


SEQ ID NO: 357 


tgaaggagactattcagaa 


7219 


7238 


SEQ ID NO: 1360 


ttctgcacagaaatattca 


13438 


13457 2 


4 


SEQ ID NO: 358 


ttcaggctcttcagaaagc 


7921 


7940 


SEQ ID NO: 1361 


gcttgctaacctctctgaa 


12304 


12323 2 


4 


SEQ ID NO: 


359 


tocacaaattg aacatccc 


8779 


8798 


SEQ ID NO: 1362 


gggacctaccaagagtgga 



12525 


12544 2 


4 


SEQ ID NO: 


360 


tgaataccaatgctgaact 


10159 


10178 


SEQ ID NO: 1363 


agttcaatgaatttattca 



13183 


13202 2 


4 


SEQ ID NO: 


361 


taaactaatagatgtaatc 


12890 


12909 


SEQ ID NO: 1364 


gattactatgaaaaattta 


13632 


13651 2 


4 


SEQ ID NO: 


362 


ttgacctgtccattcaaaa 


13672 


13691 


SEQ ID NO: 1365 


ttttaaaagaaatcttcaa


13805 


13824 2 


4 


SEQ ID NO: 


363 


gggctgagtgcccttctcg 


11 


30 


SEQ ID NO: 1366 


cgaggccaggccgcagccc 



76 


95 1 


4 


SEQ ID NO: 


364 


ggctgagtgcccttctcgg 


12 


31 


SEQ ID NO: 1367 


ccgaggccaggccgcagcc 


75 


94 1 


4 


SEQ ID NO: 


365 


ctgagtgcccttctcggtt 


14 


33 


SEQ ID NO: 1368 


aaccgtgcctgaatctcag 


11549 


11568 1 


4 


SEQ ID NO: 


366 


tctcggttgctgccgctga


25 


44 


SEQ ID NO: 1369 


tcagctgacctcatcgaga


2160 


2179 1 


4 


SEQ ID NO: 


367 


caggccgcagcccaggagc 


82 


101 


SEQ ID NO: 1370 


gctctgcagcttcatcctg 


368 


387 1 


4 


SEQ ID NO: 368 


gctggcgctgcctgcgctg 


143 


162 


SEQ ID NO: 1371 


cagcacagaccatttcagc 


4244 


4263 1 


4 


SEQ ID NO: 


369 


tartnrtanf^ancicacpaa 


169 


188 



SEQ ID NO: 1372


ctggatgtaaccaccagca


11178 


11197 1 



4 


SEQ ID NO: 


370 


ctggtctgtccaaaagatg 


219 


238 


SEQ ID NO: 1373 


catcctgaagaccagccag 


380 


399 1 


4 


SEQ ID NO: 


371 


ctgagagttccagtggagt


283 


302 


SEQ ID NO: 1374 


actcaccctggacattcag 


3383 


3402 1 


4 


SEQ ID NO: 


372 


tccagtggagtccctggga 


291 


310 


SEQ ID NO: 1375 


tcccggagccaaggctgga 


2675 


2694 1 


4 


SEQ ID NO: 373 


aggttgagctggaggttcc 


346 


365 


SEQ ID NO: 1376 


ggaaccctctccctcacct 


4728 


4747 1 


4 


SEQ ID NO: 


374 


tgagctggaggttccccag 


350 


369 


SEQ ID NO: 1377 


ctgggaggcatgatgctca 


9163 


9182 1 


4 


SEQ ID NO: 


375 


tctgcagcttcatcctgaa 


370 


389 


SEQ ID NO: 1378 


ttcaaatataatcggcaga 


3261 


3280 1 


4


SEQ ID NO:


376 


gccagtgcaccctgaaaga 


394 


413 


SEQ ID NO: 


377 


ctctgaggagtttgctgca 


464 


483 


SEQ ID NO: 


378 


aggtatgagctcaagctgg


492 


511 


SEQ ID NO: 


379 


tcctttacccggagaaaga 



535 


554 


SEQ ID NO:


380 


catcaagaggggcatcatt 


575 


594 


SEQ ID NO: 


381 


tcctggttcccccagagac


601 


620 


SEQ ID NO: 


382 


aagaagccaagcaagtgtt 


622 


641 


SEQ ID NO: 


383 


aagcaagtgttgtttctgg 


630 


649 


SEQ ID NO: 


384 


tctggataccgtgtatgga 


644 


663 


SEQ ID NO: 


385 


ccactcactttaccgtcaa 


670 


689 


SEQ ID NO: 


386 


aggaagggcaatgtggcaa 


693 


712 


SEQ ID NO: 


387 


gcaatgtggcaacagaaat 


700 


719 


SEQ ID NO: 


388 


caatgtggcaacagaaata 


701 


720 


SEQ ID NO: 


389 


tggcaacagaaatatccac 


706 


725 


SEQ ID NO: 


390 


agagacctgggccagtgtg 


729 


748 


SEQ ID NO: 


391 


tgtgatcgcttcaagccca 


744 


763 


SEQ ID NO: 


392 


gtgatcgcttcaagcccat 


745 


764 


SEQ ID NO: 


393 


cagcccacttgctctcatc 


776 


795 


SEQ ID NO: 


394 


gctctcatcaaaggcatga 


786 


805 


SEQ ID NO: 


395 


ccttgtcaactctgatcag 


811 


830 


SEQ ID NO: 


396 


cttgtcaactctgatcagc 


812 


831 


SEQ ID NO: 


397 


agccatctgcaaggagcaa 


884 


903 


SEQ ID NO: 


398 


gccatctgcaaggagcaac 


885 


904 


SEQ ID NO: 


399 


cttcctgcctttctcctac 


908 


927 


SEQ ID NO: 400 


ctttctcctacaagaataa 


916 


935 


SEQ ID NO: 


401 


gatcaacagccgcttcttt 


989 


1008 


SEQ ID NO: 402 


atcaacagccgcttctttg 


990 


1009 


SEQ ID NO: 


403 


acagccgcttctttggtga 


994 


1013 


SEQ ID NO: 


404 


aagatgggcctcgcatttg 


1023 


1042 


SEQ ID NO: 


405 


tgttttgaagactctccag 


1082 


1101 


SEQ ID NO: 


406 


ttgaagactctccaggaac 


1086 


1105 


SEQ ID NO: 


407 


aactgaaaaaactaaccat


1102 


1121 


SEQ ID NO: 408 


ctgaaaaaactaaccatct 


1104 


1123 


SEQ ID NO: 


409 


aaaactaaccatctctgag


1109 


1128 


SEQ ID NO: 


410 


tgagcaaaatatccagaga 


1124 


1143 


SEQ ID NO: 


411 


caataagctggttactgag 


1154 


1173 


SEQ ID NO: 412 


tactgagctgagaggcctc 


1166 


1185 


SEQ ID NO: 


413 


gcctcagtgatgaagcagt 


1180 


1199 



SEQ ID NO: 1379 


tcttccgttctgtaatggc 


5794 


5813 


1 


4 


SEQ ID NO: 1380 


tgcaagaatattttgagag 


6340 


6359 


1 


4 


SEQ ID NO: 1381 


ccagtttccggggaaacct 


12716 


12735 


1 


4 


SEQ ID NO: 1382 


tctttttgggaagcaagga 


2219 


2238 


1 


4 


SEQ ID NO: 1383 


aatggtcaagttcctgatg 


2277 


2296 


1 


4 


SEQ ID NO: 1384 


gtctctgaactcagaagga 


13988 


14007 


1 


4 


SEQ ID NO: 1385 


aacaaataaatggagtctt 


14072 


14091 


1 


4 


SEQ ID NO: 1386 


ccagagccaggtcgagctt 


11042


11061 


1 


4 


SEQ ID NO: 1387 


tccatgtcccatttacaga 


11356 


11375 


1 


4 


SEQ ID NO: 1388 


ttgattttaacaaaagtgg


6817 


6836 


1 


4 


SEQ ID NO: 1389 


ttgcaagcaagtctttcct 


3005 


3024 


1 


4 


SEQ ID NO: 1390 


atttccataccccgtttgc 


3480 


3499 


1 


4 


SEQ ID NO: 1391 


tattcttcttttccaattg


13827


13846 


1 


4 


SEQ ID NO: 1392 


gtggcttcccatattgcca 


1887 


1906 


1 


4 


SEQ ID NO: 1393 


cacattacatttggtctct 


2930 


2949 


1 


4 


SEQ ID NO: 1394 


tgggaaagccgccctcaca 


5210 


5229 


1 


4 


SEQ ID NO: 1395 


atgggaaagccgccctcac 


5209 


5228 


1 


4 


SEQ ID NO: 1396 


gatgctgaacagtgagctg 


8144 


8163 


1 


4 


SEQ ID NO: 1397 


tcataacagtactgtgagc 


10337 


10356 


1 


4 


SEQ ID NO: 1398 


ctgagtgggtttatcaagg 


12445 


12464 


1 


4 


SEQ ID NO: 1399 


gctgagtgggtttatcaag


12444 


12463 


1 


4 


SEQ ID NO: 1400 


ttgcaatgagctcatggct 


3805 


3824 


1 


4 


SEQ ID NO: 1401 


gttgcaatgagctcatggc 


3804 


3823 


1 


4 


SEQ ID NO: 1402 


gtaggaataaatggagaag 


9453 


9472 


1 


4 


SEQ ID NO: 1403 


ttattgctgaatccaaaag 


13648 


13667 


1 


4 


SEQ ID NO: 1404 


aaagccatcactgatgatc 


1661 


1680 


1 


4 


SEQ ID NO: 1405 


caaagccatcactgatgat 


1660 


1679 


1 


4 


SEQ ID NO: 1406 


tcacaaatcctttggctgt 


9667 


9686 


1 


4 


SEQ ID NO: 1407 


caaaatagaagggaatctt 


2069 


2088 


1 


4 


SEQ ID NO: 1408 


ctggtaactactttaaaca 


5487 


5506 


1 


4 


SEQ ID NO: 1409 


gttcaatgaatttattcaa 


13184 


13203 


1 


4 


SEQ ID NO: 1410 


atggcattttttgcaagtt 


14006 


14025 


1 


4 


SEQ ID NO: 1411 


agattgatgggcagttcag 


4564 


4583 


1 


4 


SEQ ID NO: 1412 


ctcaaagaatgactttttt 


2570 


2589 


1 


4 


SEQ ID NO: 1413 


tctccagataaaaaactca 


12201 


12220 1 


4 


SEQ ID NO: 1414 


ctcagatcaaagttaattg 


12265 


12284 


1 


4 


SEQ ID NO: 1415 


gagggtagtcataacagta 


10329 


10348 


1 


4 


SEQ ID NO: 1416 


actgttgactcaggaaggc 


12572 


12591 


1 


4


SEQ ID NO: 414


agtcacatctctcttgcca 


1196 


1215 


SEQ ID NO: 1417 


tggccacatagcatggact 


8858 


8877 


1 


4 


SEQ ID NO: 415 


atctctcttgccacagctg 


1202 


1221 


SEQ ID NO: 1418 


cagctgacctcatcgagat 


2161 


2180 


1 


4 


SEQ ID NO: 416 


tctctcttgccacagctga 


1203 


1222 


SEQ ID NO: 1419 


tcagctgacctcatcgaga 


2160 


2179 


1 


4 


SEQ ID NO: 417 


tgccacagctgattgaggt 


1210 


1229 


SEQ ID NO: 1420 


acctgcaccaaagctggca


13955 


13974 


1 


4 


SEQ ID NO: 418 


gccacagctgattgaggtg 


1211 


1230 


SEQ ID NO: 1421 


caccaaaaaccccaatggc


11240 


11259 


1 


4 


SEQ ID NO: 419 


tcactttacaagccttggt 


1240 


1259 


SEQ ID NO: 1422 


accagatgctgaacagtga 


8140 


8159 


1 


4 


SEQ ID NO: 420 


cccttctgatagatgtggt 


1324 


1343 


SEQ ID NO: 1423 


accacttacagctagaggg 


10816 


10835 


1 


4 


SEQ ID NO: 421 


gtcacctacctggtggccc 


1341 


1360 


SEQ ID NO: 1424 


gggcgacctaagttgtgac 


3431 


3450 


1 


4 


SEQ ID NO: 422 


ccttgtatgcgctgagcca 


1432 


1451 


SEQ ID NO: 1425 


tggctggtaacctaaaagg


5578 


5597 


1 


4 


SEQ ID NO: 423 


gacaaaccctacagggacc 


1472 


1491 


SEQ ID NO: 1426 


ggtcctttatgattatgtc


12347 


12366 1 


4 


SEQ ID NO: 424 


tgctaattacctgatggaa 


1508 


1527 


SEQ ID NO: 1427 


ttcccaaaagcagtcagca 


9930 


9949 


1 


4 


SEQ ID NO: 425 


tgactgcactggggatgaa 


1538 


1557 


SEQ ID NO: 1428 


ttcaggtccatgcaagtca 


10909 


10928 1 


4 


SEQ ID NO: 426 


actgcactggggatgaaga 


1540 


1559 


SEQ ID NO: 1429 


tcttgaacacaaagtcagt 


5999 


6018 


1 


4 


SEQ ID NO: 427 


atgaagattacacctattt 


1552 


1571 


SEQ ID NO: 1430 


aaatgaaagtaaagatcat 


8110 


8129 


1 


4 


SEQ ID NO: 428 


accatggagcagttaactc 


1602 


1621 


SEQ ID NO: 1431 


gagtaaaccaaaacttggt


9016 


9035 


1 


4 


SEQ ID NO: 429 


gcagttaactccagaactc 


1610 


1629 


SEQ ID NO: 1432 


gagttactgaaaaagctgc 


13719 


13738 


1 


4 


SEQ ID NO: 430 


cagaactcaagtcttcaat 


1621 


1640 


SEQ ID NO: 1433 


attggatatccaagatctg 


1925 


1944 


1 


4 


SEQ ID NO: 431 


caggctctgcggaaaatgg 


1695 


1714 


SEQ ID NO: 1434 


ccatgacctccagctcctg


2477 


2496 


1 


4 


SEQ ID NO: 432 


ccaggaggttcttcttcag 


1730 


1749 


SEQ ID NO: 1435 


ctgaaatacaatgctctgg 


5511 


5530 


1 


4 


SEQ ID NO: 433 


ggttcttcttcagactttc 


1736 


1755 


SEQ ID NO: 1436 


gaaaaacttggaaacaacc 


4431 


4450 


1 


4 


SEQ ID NO: 434 


tttccttgatgatgcttct 


1751 


1770 


SEQ ID NO: 1437 


agaatccagatacaagaaa 


6885 


6904 


1 


4 


SEQ ID NO: 435 


ggagataagcgactggctg 


1773 


1792 


SEQ ID NO: 1438 


cagcatgcctagtttctcc 


9944 


9963 


1 


4 


SEQ ID NO: 436 


gctgcctatcttatgttga 


1788 


1807 


SEQ ID NO: 1439 


tcaatatcaaaagcccagc 


12037 


12056 


1 


4 


SEQ ID NO: 437 


actttgtggcttcccatat 


1882 


1901 


SEQ ID NO: 1440 


atatctggaaccttgaagt


10729 


10748 


1 


4 


SEQ ID NO: 438 


gccaatatcttgaactcag 


1902 


1921 


SEQ ID NO: 1441 


ctgaactcagaaggatggc 


13992 


14011 


1 


4 


SEQ ID NO: 439 


aatatcttgaactcagaag 


1905 


1924 


SEQ ID NO: 1442 


cttccattctgaatatatt 


13370 


13389 1 


4 


SEQ ID NO: 440 


ctcagaagaattggatatc 


1916 


1935 


SEQ ID NO: 1443 


gataaaagattactttgag 


7265 


7284 


1 


4 


SEQ ID NO: 441 


aagaattggatatccaaga


1921 


1940 


SEQ ID NO: 1444 


tcttcaatttattcttctt 


13817 


13836 


1 


4 


SEQ ID NO: 442 


agaattggatatccaagat 


1922 


1941 


SEQ ID NO: 1445 


atcttcaatttattcttct 


13816 


13835 


1 


4 


SEQ ID NO: 443 


tggatatccaagatctgaa


1927 


1946 


SEQ ID NO: 1446 


ttcacataccagaattcca 


8317 


8336 


1 


4 


SEQ ID NO: 444 


atatccaagatctgaaaaa 


1930 


1949 


SEQ ID NO: 1447 


tttttaaccagtcagatat 


10177 


10196 1 


4 


SEQ ID NO: 445 


tatccaagatctgaaaaag 


1931 


1950 


SEQ ID NO: 1448 


ctttttaaccagtcagata 


10176 


10195 


1 


4 


SEQ ID NO: 446 


caagatctgaaaaagttag 


1935 


1954 


SEQ ID NO: 1449 


ctaaattcccatggtcttg 


4965 


4984 


1 


4 


SEQ ID NO: 447 


aagatctgaaaaagttagt 


1936 


1955 


SEQ ID NO: 1450 


actaaattcccatggtctt


4964 


4983 


1 


4 


SEQ ID NO: 448 


tgaaaaagttagtgaaaga 


1942 


1961 


SEQ ID NO: 1451 


tctttctcgggaatattca 


10622 


10641 


1 


4 


SEQ ID NO: 449 


tccaactgtcatggacttc


1982 


2001 


SEQ ID NO: 1452 


gaagcacatatgaactgga 


13937 


13956 


1 


4 


SEQ ID NO: 450 


tcagaaaattctctcggaa 


1999 


2018 


SEQ ID NO: 1453 


ttcctttaacaattcctga


9493 


9512 


1 


4 


SEQ ID NO: 451 


ttccatcacttgacccagc


2044


2063


SEQ ID NO: 1454 


gctgacatagggaatggaa 


8433 


8452 


1 


4


SEQ ID NO:


452


cccagcctcagccaaaata 


2057 


2076 


SEO ID NO- 14*5^5 




fo\Z. 


7831 


1 




SEQ ID NO:


453


agcctcagccaaaatagaa


2060 


2n7Q 




f O t O O O H"n « j-» ^ f 

llda1.CV/aagalLyggCl 


7814 


7833 


1 


4 


SEQ ID NO:


454


atcttatatttgatccaaa


2083 


2102 


SEO ID NO- 14^=57 


L L ct CI CI CI CI U CI d d y U ct U 3 1 


1 lolo 


1 1 oo2 


1 


/I 


SEQ ID NO:


455


tcttatatttgatccaaat 


2084 


2103 


SEQ ID NO- 1458 


attttttn na ^nffnasirfs) 

CllLLLlLVJ wClCl^ L LClClClMCl 


i*fU 1 1 




1 


A 


SEQ ID NO:


456


cttcctaaaqaaaacatac 


2109 


2128 


SEO ID NO' l-d'^Q 


y Uca ly y Udlld ly aig alay 


ooUb 




1 


4 


SEQ ID NO:


457


ctaaao aaaci ca to cto aa 


2113 


2132 


SEO in NO" 14.Rn 


L H/ca y y y ly ly g «d g II ca y 


ODOO 


erne 


1 


4 


SEQ ID NO:


458


taaaciaaaQcatactaaaa 


2114


2133 


SEO in wo- 


LLlLrllcIclcilUelLlOljplllcl 




yoUl 


1 


4 


SEQ ID NO:


459


g ag atta g cttaq a a o a a a 


2175 


2194 


SEO ID MO' 14B9 




i 1 / U 1 


1 1 / ZXJ 




/I 
4 


SEQ ID NO:


460


ctttqagccaacattQQaa 


2198 


2217 


SEO ID NO* 146"^ 


I ILrOet d ly elU w a a y 3d a a g 


1 1 UDU 






4 


SEQ ID NO:


461


cagacagtgtcaacaaaac


2245 


2264 


SEO ID NO- 14fiA 


y uiid ciy y a eg aacicig 


o I o4 


b1 OO 


1 


4 


SEQ ID NO:


462


cagtgtcaacaaaacttta


2249 


2268 


SFO in MO- -I^R^ 


caaa uccly gatacacig 


o o /i n 


ybbo 




4 


SEQ ID NO:


463


agtgtcaacaaaactttat


2250 


2269 


SEO ID NO' 


ctcdayaaiacytciacaci 


400 1 


4of 0 


1 


4 


SEQ ID NO:


464


ctgatggtgtctctaaaat


2290 


2309 


SEO ID NO 14fi7 


C3 o P" ir^ Q Q <^ Q ^ t <^ ^ t <a <^ 

duuLuyy ddOddicdcag 




oo44 


1 


4 


SEQ ID NO:


465


tgatggtgtctctaaaatc


2291 



2310 


SFO in NO- 14fiR 


gaccigcgcaacgayatca 


oo^o 


oo42 


1 


4 


SEQ ID NO:


466


aaacatqaQcaqaatataa 


2343 


2362 


SEO ID NO- 14fiQ 


UL»d Ly d LU LdUd lliy III 


D / OO 


OOU/ 


A 
1 


4 


SEQ ID NO:


467


gaagctgattaaagatttg


2387 


2406 


SEQ ID NO* 1470 


OdddddOdLLLLL/ddCLlO 




froQQ 

ozyo 


1 


4 


SEQ ID NO:


468


aaagatttqaaatccaaao 


2397 


2416 


SEQ ID NO- 1471 


LrLlLddyLLOdyUdLUllL 


/ DUD 


/ OZO 


A 
I 


4 


SEQ ID NO:


469 


qatqaatacccacactcta 


2510 


2529 


SEO ID NO- 1479 


u dy d iiiy ay y a iicca ic 


/ y / o 


/ yy4 


1 


4 


SEQ ID NO:


470 


q qqatcccccaoatqatta 


2532 


2551 


SEO ID NO* 147*^ 


^ddlUdOddy luy dlLCCC 


oU f o 


yuy4 


\ 


4 


SEQ ID NO: 


471 


ttttcttcactacatcttc 


2585 


2604 


SEO ID NO' 1474 


yddyiyLUdyiyyuaaaaa 




n uoyo 


A 

\ 


4 


SEQ ID NO: 


472 


tcttcactacatcttcatg


2588 


2607 


SEQ ID NO' 147^ 


udiy y udiLdiydiy ddy a 






A 
\ 


4 


SEQ ID NO: 


473 


tacatcttcatggagaatg


2595 


2614 


SEQ ID NO' 1476 


udLLdiyydyycccdiy Id 


9437 


y40D 


A 

\ 


4 


SEQ ID NO: 


474 


ttcatggagaatgcctttg


2601 


2620 


SEQ ID NO- 1477 


/^aaaa'l'oaaoi"l"l'sa tf^^^ 
UddddLoddULllddl^aa 






A 
\ 


4 


SEQ ID NO: 


475 


tcatggagaatgcctttga


2602 


2621 


SEQ ID NO* 1478 


iv^adLrdwddiVrLL^dd lyd 


1 O 1 UO 


1 O 1 Z / 


-1 
1 


4 


SEQ ID NO: 


476 


tttgaactccccactggag


2616 


2635 


SEQ ID NO' 1479 


w L^WrLfOdy y dULr IL LUddd 


Oft "^4 




1 


4 


SEQ ID NO: 


477 


ttgaactccccactggagc


2617 


2636 


SEQ ID NO 1480 


y LrLOU^vrdy ydLrULLLUdd 


^OOO 




A 
1 


4 


SEQ ID NO: 


478 


tgaactccccactggagct


2618 


2637 


SEQ ID NO* 1481 


dy ^iLrOUOdyydOvriiiUd 


Q fi 
9000 


yoo 1 


1 


4 


SEQ ID NO: 


479 


cactggagctggattacag


2627 


2646 


SEQ ID NO" 1482 


w ly LLL^ Ly dy Looody ly 




1 


4 


SEQ ID NO: 


480 


actggagctggattacagt


2628 


2647 


SEQ ID NO* 1483 


dt^Ly LLLLr Ly dy LOL^ody i 






1 


A 

4 


SEQ ID NO: 


481 


agttgcaaatatcttcatc


2644 


2663 


SEQ ID NO- 1484 


y d Ly d ly UUdddd lUddU I 




DDIU 


i 

1 


4 


SEQ ID NO: 


482 


gttgcaaatatcttcatct


2645 


2664 


SEO in NO* 148'=i 


dydiydLyouaaaaicaac 




Dbuy 


1 


4 


SEQ ID NO: 


483 


aaatatcttcatctqqaot 


2650 


2669 

^ w 


SEO ID NO* 14Rfi 


dULUdy ddyy diy y cam 






A 
\ 


4 


SEQ ID NO: 


484 


taaaactaaaaataocpsf^ 






^FO in wo* 14R7 


ngguacaggaggcuta 


/ t)9z 


761 1 


A 
1 


4 


SEQ ID NO: 


485 


ggctgaactggtggcaaaa 


2720 


2739 


SEQ ID NO: 1488 


ttttcttttcagcccagcc


9220 


9239 


1 


4 


SEQ ID NO: 


486


tgtggagtttgtgacaaat 


2750 


2769 


SEQ ID NO: 1489 


attttcaagcaaatgcaca 


8530 


8549 


1 


4 


SEQ ID NO: 


487 


ttgtgacaaatatgggcat 


2758 


2777


SEQ ID NO: 1490 


atgcgtctaccttacacaa 


9613


9632 


1 


4 


SEQ ID NO: 


488 


atgaacaccaacttcttcc 


2811 


2830 


SEQ ID NO: 1491 


ggaagctgaagtttatcat 


2869 


2888 


1 


4 


SEQ ID NO: 


489 


cttccacgagtcgggtctg 


2825 


2844 


SEQ ID NO: 1492 


cagagctatcactgggaag


5227 


5246 


1 


4


SEQ ID NO:


490 


oaQtcaaatctaQaaactc 


2832 


2851 


SEQ ID NO: 1493


aaacttactaoacaaactc 


6132 


6151 


1 

1 


4 


SEQ ID NO: 


491 


cctaaaaactaaaaaacta 


2858 


2877 


SEQ ID NO: 1494


caacctccccaaccatao a 


12112 


12131 


1 


4 


SEQ ID NO: 


492 


aactaoaaaactaaaattt 


2864 


2883 


SEQ ID NO: 1495


aaactattaatttacaact 


5455 


5474 


1 

1 


4 


SEQ ID NO: 


493 


ccaaattaQaQCtQcaact 


3106 


3125 


SEQ ID NO: 1496


aotttccaoaaaaacctaa 


12718 


12737 


1 


4 


SEQ ID NO: 


494 


oaataccctaaaatttata 


3200 


3219 


SEQ ID NO: 1497


tacaa ta ttcta a a a atcc 


8385 



8404 


1 

1 


4 


SEQ ID NO: 


495 


cto aa a ctaccataa catt 


3244 


3263 


SEQ ID NO: 1498


aatcnaactcpitaanttcpn 



3809 




1 


4 


SEQ ID NO: 


496 


tcitcca 0 to a aotccaa at 


3289 


3308 


SEQ ID NO: 1499


attttaaaaaasatcDiana 


6349 


6368 



1 
1 


4 


SEQ ID NO: 


497 


aattccaaattttGiatatt 


3305 


3324 


SEQ ID NO: 1500


aacacataaatcacasiatl 


8930 


8949 


1 
1 


4 


SEQ ID NO: 


498 


ttccaaattttaatattaa 


3307 


3326 


SEQ ID NO: 1501


tcaaaacaaartiranoaa 


13199 


13218


1 


4 


SEQ ID NO: 499 


CO a aacaatcctcaaaatt 


3329 


3348 


SEQ ID NO: 1502


aacttcitacaactaatGca 


4203 


4222 


1 

1 


4 


SEQ ID NO: 


500 


tcotcaoaottaataataa 


3337 


3356 


SEQ ID NO: 1503


tcatcaatta attacaaaa 


7585 


7604 


1 


4 


SEQ ID NO: 


501 


ctcaccctggacattcaga


3384 


3403 


SEQ ID NO: 1504


tct a ca a a a ca ata ct a a a 


12431 


12450 


1 

1 


4 


SEQ ID NO: 


502 


cattcagaacaagaaaatt


3395 


3414 


SEQ ID NO: 1505


aattaactttatao a aatn 


8096 


8115


i 

1 


4 


SEQ ID NO: 


503 


a eta aaatcoccctcataa 


3414 


3433 


SEQ ID NO: 1506


ccatccaaotcaacccacit 


10916 


10935 


1 

* 


4 


SEQ ID NO: 


504 


ttatttccataccccattt 


3478 


3497 


SEQ ID NO: 1507


aaactacctatattaataa 


13872 


13891 


1 


4 


SEQ ID NO: 


505 


QtttQcaaacaciaaciccaQ 


3493 


3512 


SEQ ID NO: 1508 


otaa acttctcttGaaaac 


5400 


5419 


1 


4 


SEQ ID NO: 


506 


tttacaaacaaaaaccaQa 


3494 


3513 


SEQ ID NO: 1509


tctaQatatcaacaacaaa 


5264 


5283 


1 


4 


SEQ ID NO: 


507 


ttacaaacaaaaaccaaaa 


3495 


3514 


SEQ ID NO: 1510


ttctaaotatcaacaacaa 


5263 


5282 


1 

1 


4 


SEQ ID NO: 


508 


ctQcttctccaaatQQ act 


3546 


3565 


SEQ ID NO: 1511 


aQtcaaQattaatQaacaQ 


4559 


4578 


1 


4 


SEQ ID NO: 


509 


tacta caacttata a ctcc 


3569 


3588 


SEQ ID NO: 1512


aaaaactttaaattcaaca 


7601 


7620 


1 


4 

r 


SEQ ID NO: 


510 


acaacttataactccacaa 


3573 


3592 


SEQ ID NO: 1513


cto tataaca a attccta t 


5889 


5908 


1 

1 


4 


SEQ ID NO: 


511 


tttccaaaaaaotaocata 


3592 


3611 


SEQ ID NO: 1514


cata a acttcttcta aa a a 


8869 


8888 


1 


4 

r 


SEQ ID NO: 


512 


ccaaoaaa atGQcataaca 


3595 


3614 


SEQ ID NO: 1515


tacccaacaaacaaattaa 


9353 


9372 


1 


4 


SEQ ID NO: 


513 


ataQcatQacattataato 


3603 


3622 


SEQ ID NO: 1516


catccttaacaccttccac 


8063 


8082 


1 

1 


4 


SEQ ID NO: 


514 


tgatgaagagaagattgaa


3617 


3636 


SEQ ID NO: 1517


ttcacto ttccto aa atca 


7863 


7882 


1 


4 

4 


SEQ ID NO: 


515 


gaagagaagattgaatttg


3621 


3640 


SEQ ID NO: 1518


caaaaacattttcaacttc 


5279 


5298 


1 

1 


4 


SEQ ID NO: 


516 


gagaagattgaatttgaat


3624 


3643 


SEQ ID NO: 1519


attcataatcccaactctc 


8270 


8289 


1 


4 

4 


SEQ ID NO: 


517 


tttgaatggaacacaggca


3636 



3655 


SEQ ID NO: 1520


tacctttotatacaccaaa 


11228 



11247 


1 


4 


SEQ ID NO: 


518 


aaQcaccaatataQatacc 


3650 


3669 


SEQ ID NO: 1521 


Qataacctaaaaaaaacct 


5583 


5602 


1 


4 

4 


SEQ ID NO: 


519


caaaaaaataacttccaat 


3668 


3687 


SEQ ID NO: 1522


attgaagtacctactttta


8358 



8377 


1 


4 

r 


SEQ ID NO: 


520 


aaaaaaataacttccaatt 


3669 


3688 


SEQ ID NO: 1523 


aattgaagtacctactttt


8357 



8376 


1 

1 


4 


SEQ ID NO: 


521 


aaaaaaiyaciiccaaiu 








aaaiccaaiciccicuiL 




HA'i "7 
0*f 1 / 


1 




SEQ ID NO: 


522 


cagagtccctcaaacagac


3762 


3771 


SEQ ID NO: 1525 


gtctgtgggattccatctg 


4082 


4101 


1 


4 


SEQ ID NO: 523 


aaattaatagttgcaatga 


3795 


3814 


SEQ ID NO: 1526 


tcataagttcaatgaattt 


13178 


13197 


1 


4 


SEQ ID NO: 


524 


ttcaacctccagaacatgg 


3891 


3910 


SEQ ID NO: 1527 


ccattgaccagatgctgaa 


8134 


8153 


1 


4 


SEQ ID NO: 525 


tgggattgccagacttcca 


3907 


3926 


SEQ ID NO: 1528 


tggaaatgggcctgcccca


8895 


8914 


1 


4 


SEQ ID NO: 


526 


cagtttgaaaattgagatt 


3986 


4005 


SEQ ID NO: 1529 


aatcacaactcctccactg 


9533 


9552 


1 


4 


SEQ ID NO: 527 


gaaaattgagattcctttg 


3992 


4011 


SEQ ID NO: 1530 


caaaactaccacacatttc 


13686 


13705 


1 


4


SEQ ID NO:


528 


ttto cctttto a to 0 caaa 

hh^^ WW fchkh^^ WKil«« 


4007 

■ WW 1 


4026 




ID NO- 1531 

1 INVa/a l%Jw 1 




OOO 1 


OO r U 1 


4 


SEQ ID NO:


529 


ctccaoaaatctaaaaata 


4028 


4047 


^_ Vjc 


ID MO' I'S'^P 


r*ato5iaftnnt+ar»annan 
wci i^dd L y LLdUciy y ci y 


f OOO 


r OUO 1 


4 


SEQ ID NO:


530 


tctaaaaatattaaaoact 


4037 

~ww / 


4056 


SPO 


ID NO' l^'^'^ 

II-/ iNV^. lOOO 


d^l^OllOctiyiUUULdyd 


1 uuzo 


-1 C\r\AA 1 
1 UUH-^ 1 


4 


SEQ ID NO:


531 


ctctt a aa attccatcta cc 


4084 


4103 


SEO 


ID NO' 1*^34 


nrir'9ti"H'naaaaaaar*an 
yyudlLLiydddddddLrdy 


t ^( 


QTzlft 1 


4 


SEQ ID NO:


532 


atctaccatctcaaaaatt 


4096 

1^ w wU 


^ J I w 




in Nin- I'S'^^ 


ctdU ddd wl^U Lddy d l 


OOH-O 


OOO r 1 


4 


SEQ ID NO:


533 


tctca aaaa ttcca aa tec 


4104 


4123 

^ 1 ^w 


SEO 


ID Wn* 153R 

It-/ I^W. l\JOVJ 




O^Ur 


R29R 1 


4 


SEQ ID NO:


534 


SI a tcccta ctttl accatt 


4118 

1 1 U 




0 1 — Vo£ 


ID MO* 1^'^7 


a a i" n a a t a <^ a <^ *^ a *"i n a ^ fr 
dd ly dd LdLod ^l^Lfd^ydUl 


OU r O 


ouy / 1 


A 


SEQ ID NO:


535 


acttttaccattccca ao t 


4125 


4144 


SEO 


ID MO- If^'^R 


a 1 1 1 n t a rt a a a + n a Q Q i' 
dULLiy Idyducaiyradidy L 


O 1 U 1 


fti on 1 


4 


SEQ ID NO:


536 


cattcccaaattatatcaa 


4133 


4152 


SEO 


ID NO- 1539 

lil^^. 1 www 


ttn a a n n af*ttr*a n n a at n 
L d dy y durL LVrfdy y d d 


I9nni 


19090 1 


4 


SEQ ID NO: 


537 


accacataaaaactoactc 


4276 


4295 

~bW W 


SEQ


ID NO: 1540


nantaaar»r^aaaar»tfnrif 
ydy idddoik/ddddULiy y L 


QniR 

£7V/ 1 o 




4 


SEQ ID NO:


538 


tttcctacaatcitanaaaa 


4309


4328


SEQ


ID NO: 1541


^L>llLddiUddUL><j>iydaa 


9495


9514 1


4 


SEQ ID NO:


539 



cl'aaaaaaacaan?)t^1cip 


4330 



4349 



SEQ


ID NO: 1542


af tr^tfi in /I f ^ Mtr^^ a 

LUdiLCriyyy luiuuodly 


•1 1 no7 




4 


SEQ ID NO:


540 


atcatataataa citctcta 



4370 


4389 



SEQ


ID NO: 1543


f ari a af f afsnaaaat r*i 
Ldy dciLLdUdyddcidiydL 


R557 
DOO / 


R57R 1 
OO f O 1 


4 


SEQ ID NO:


541 



catotaatanntfitfitsna 


4372 



4391 


SEQ


ID NO: 1544


f*ntanrioaf*r*ntnnin^a+n 

Ldy y cduuy ly y y ca ly 


19195 
1^ 1 ^O 


191 AA 1 


4


SEQ ID NO:


542 


ttctaaattpaaatatraa 


4399 



4418 


SEQ


ID NO: 1545


ttnat natn^tnt oaanaa 

Liy d ly diy oiy lUddy dd 


/ ouu 


7*^10 1 


4


SEQ ID NO:


543 





44Q1 




SEQ


ID NO: 1546


Udy ddllCCdy ClLCCUUd 




0*JH-O 1 


4


SEQ ID NO: 


544 



ctaacacta acca actcaa 


4636 


4655 



SEQ 


ID NO: 1547


L ly d y y V* Id iiy diy Lioy 




RQQ5 1 


4 


SEQ ID NO:


545 


taacactaQccaactcaat 


4637 


4656 


SEQ


ID NO: 1548


diiy dy y widLiydiy Lid 


RQ75 


fiC)Q4 1 


4 


SEQ ID NO:


546 



aacactaaccaactcaata 


4638 



4657 


SEQ 


ID NO: 1549


pattnaoaf*tattnatnH" 


RQ74 


6QQ3 1 


4 


SEQ ID NO: 


547 


eta Qcca a ctcaata a aa a 


4642 


4661 


SEQ


ID NO: 1550


tptppafptnpnptapr'an 

Lw L^^CI L w iU wy WLCIV^V^CIU 


12065 


12084 1 


4 


SEQ ID NO: 


548 


aa ataacao o aa a at ata a 


4705 


4724 


SEQ 


ID NO: 1551


tcatctcctttcttcatct


10202


10221 1


4 


SEQ ID NO: 


549 


tccctcacctccacctcta 


4737 


4756 



SEQ


ID NO: 1552


nanatatatatptpannriffi 
way a iciiiCiLciLw Lwciy y y ci 


8176


8195 1


4 


SEQ ID NO:


550 


a acta actttaa aatcta a 


4810 


4829 


SEQ


ID NO: 1553


tpannpfpttpanaaanpt 
Lwciyy v^LWLiwdy dddy wL 


7Q29 


7Q41 1 

1 Ct i 1 


4 


SEQ ID NO:


551 


eta acttta a a atcta a ca 


4812 


4831 


SEQ


ID NO: 1554


tgtcaagataaacaatcag


8732


8751 1 


4 


SEQ ID NO:


552 


caaa ata aatatc accttc 


4865 



4884 


SEQ


ID NO: 1555


n a antanf aptnnatpttn 
y ciciy Ldy icivLy wdivLLy 




fi854 1 


4 


SEQ ID NO:


553 


actacattctaaatatcaa 



4901 


4920 


SEQ


ID NO: 1556


wiy dy lowOdy ly Uwwdy \/ 




Q'^RI 1 


4 


SEQ ID NO:


554 



cattctaaatatcaa acta 


4905 



4924 



SEQ


ID NO: 1557


Wr«y wciciy Ldwwiy dyddv^y 




8R99 1 


4 


SEQ ID NO:


555 



aattcccataatcttcaat 


4968 



4987 



SEQ 


ID NO: 1558


aptpanafpaaanttaatt 
dwLOdydiwdddy Liddii 




1998*^ 1 


4 


SEQ ID NO:


556 



taatcttaaattaaatcct 


4976 


4995 


SEQ


ID NO: 1559


ciy wciwCiy Lcioy cicieieidwwd 


10871


10890 1 


4 


SEQ ID NO: 


557 



cttaaattaaatactaaca 


4980 


4999 



SEQ


ID NO: 1560


ly LwOwridydddLOLwddy 


1 v/wO't 


1005*^ 1 


4 


SEQ ID NO:


558 



ttaaattaaatantaariat 


4981 


5000 



SEQ


ID NO: 1561


atntppptanaaatpfpaa 
diy LwUwLdy dddiwiwdd 


1 uuoo 


10059 1 


4 


SEQ ID NO:


559 



taanttaaatactaacstn 



4982 


5001 


SEQ


ID NO: 1562


y d ly y ddvrUO lui^ wiUd 


4725


4744 1 


4 






^ /^T^J^ <^ <i ^^4i^T^^ ^^^^^^ 

acugaayiy^^y^^^cci 


5085


5104


SEQ ID NO: 1563




aggaaactcagatcaaagt 


12259 


12278 1 


4 


SEQ ID NO: 


561 


agtgtagtctcctggtgct 


5092 


5111 


SEQ 


ID NO: 1564 


agcagccagtggcaccact 


12506 


12525 1 


4 


SEQ ID NO: 


562 


gtgctggagaatgagctga 


5106 


5125 


SEQ 


ID NO: 1565 


tcagccaggtttatagcac 


7726 


7745 1 


4 


SEQ ID NO: 


563 


ctggggcatctatgaaatt 


5143 


5162 


SEQ 


ID NO: 1566 


aatttctgattaccaccag


13571 


13590 1 


4 


SEQ ID NO: 


564 


atggccgcttcagggaaca


5170 


5189 


SEQ 


ID NO: 1567 


tgttttttggaaatgccat 


8641 


8660 1 


4 


SEQ ID NO: 


565


ttcagtctggatgggaaag


5199 


5218 


SEQ 


ID NO: 1568 


ctttgacaggcattttgaa 


9719 


9738 1 


4 
SEQ ID NO:


566 


ccatgattctgggtgtcga 


5257 


5276 


SEQ ID NO: 


567 


aaaacattttcaacttcaa 


5281 


5300 


SEQ ID NO: 


568 


cttaagctctcaaatgaca 


5316 


5335 


SEQ ID NO: 


569 


ttaagctctcaaatgacat 


5317 


5336 


SEQ ID NO: 


570 


catgatgggctcatatgct 


5333 


5352 


SEQ ID NO:


571 


tgggctcatatgctgaaat


5338 


5357 


SEQ ID NO: 


572 


actggacttctcttcaaaa 




5399 


5418 


SEQ ID NO: 


573 


acttctcttcaaaacttga 


5404 


5423 


SEQ ID NO: 


574 


ctgacaagttttataagca 


5437 


5456 


SEQ ID NO: 


575 


aagttttataagcaaactg 


5442 


5461 


SEQ ID NO: 


576 


ctgttaatttacagctaca 


5458 


5477 


SEQ ID NO: 


577 


ttacagctacagccctatt 


5466 


5485 


SEQ ID NO: 


578 


tctggtaactactttaaac 


5486 


5505 


SEQ ID NO: 


579 


tttaaacagtgacctgaaa 


5498 


5517 


SEQ ID NO: 


580 


ttaaacagtgacctgaaat 


5499 


5518 


SEQ ID NO: 


581 


cagtgacctgaaatacaat 


5504 


5523 


SEQ ID NO: 


582 


tgtggctggtaacctaaaa 


5576 


5595 


SEQ ID NO: 


583 


ttatcagcaagctataaag 


5649 


5668 


SEQ ID NO: 


584 


ggttcagggtgtggagttt 


5684 


5703 


SEQ ID NO: 


585


attcagactcactgcattt 


5767 


5786 


SEQ ID NO: 


586 


ttcagactcactgcatttc 


5768 


5787 


SEQ ID NO: 587 


tacaaatggcaatgggaaa 


5840 


5859 


SEQ ID NO: 


588 


gctgtatagcaaattcctg 


5888 


5907 


SEQ ID NO: 


589 


tgagcagacaggcacctgg 


6035 


6054 


SEQ ID NO: 


590 


ggcacctggaaactcaaga 


6045 


6064 


SEQ ID NO: 


591 


tgaatacagccaggacttg 


6080 


6099 


SEQ ID NO: 


592 


gaatacagccaggacttgg 


6081 


6100 


SEQ ID NO: 


593 


ctggacgaactctggctga 


6139 


6158 


SEQ ID NO: 


594 


ttttactcagtgagcccat 


6193 


6212 


SEQ ID NO: 


595 


gatgagagatgccgttgag 


6233 


6252 


SEQ ID NO: 


596 


aattgttgcttttgtaaag 


6269 


6288 


SEQ ID NO: 


597 


cttttgtaaagtatgataa


6277


6296


SEQ ID NO: 


598 


tttgtaaagtatgataaaa 


6279 


6298 


SEQ ID NO: 


599 


tccattaacctcccatttt 


6312 


6331 


SEQ ID NO: 


600 


ccattaacctcccattttt 


6313 


6332 


SEQ ID NO: 


601 


cttgcaagaatattttgag


6338 


6357 


SEQ ID NO: 


602 


agaatattttgagaggaat 


6344 


6363 


SEQ ID NO: 603 


attatagttgtactggaaa 


6372 


6391 



SEQ ID NO 


: 1569


LuyciL^oclLrclLclOaaaiyg 


CO on 
OOOU 


C)o4» 


1 


4 


SEQ ID NO 


: 1570


i-y a Ly L Ldy dy ig oiii i 


oyoo 


/ UU4 


1


4


SEQ ID NO 


: 1571


rOTr^ SI Q Q Q i+Q Q f-i 

iV/ULdUddUcaaguaag 


7*5 /I "7 


72dd 


1 


4 


SEQ ID NO


: 1572


caiy LuuLoLrdcioaagiiaa 


i ^40 




1 


4 


SEQ ID NO 


: 1573


fl f1 P 51 "fr^t f t n rt t r» a Q+r^ 
ay »-rci L i ly y Li> LUdCcI ig 


f D 1 D 


/ boo 


1 


4 


SEQ ID NO


: 1574


G 1 1 Ldl LU<d draS y eS a g CCCa 




12953 


1 


4 


SEQ ID NO


: 1575


lUiyyuaciyoialclCcigi 


QQ70 


OOcI 1 


1 


4 


SEQ ID NO


: 1576


loeadLLygyagayacaagt 




oo1o 


1 


4 


SEQ ID NO 


: 1577


Ly ^ u Ly ly cay u la LUag 




Q7n/i 


1 


4 


SEQ ID NO 


: 1578


v^oy L\^c] ly t.dy o daa aUll 




A A Ar\ 


1 


4 


SEQ ID NO 


: 1579


ly LdULy ydddduyidCag 


DOOU 


boyy 


1 


4 


SEQ ID NO


: 1580


ddLdLLydLLpdolLiyidd 


D-H- 1 / 


D4ob 


\ 


4 


SEQ ID NO 


: 1581


^iiiy dddddL'Clddy U«ya 


•1 -i Q"! O 
1 1 O 1 


■1 QQ-I 

1 1 oo\ 


1 


4 


SEQ ID NO 


: 1582


iLLu>diLLy dday adladd 


( UZH 


'7r\A 




4 


SEQ ID NO 


: 1583


Sit1't/*aanr»aan oor*tf o o 
diiii^ddyuddy ddulldd 




\ U440 


1 


4 


SEQ ID NO 


1584 


attnnr'ntnnan^ttar'tn 
duyyuy lyy dyuiiduiy 


O 1 zo 


Ol4z 


1


4 


SEQ ID NO 


1585 


ttttnritnnanaanrT'a/^a 
iiLiyv^iyydy ddy uodUd 




10776 


1 


4 


SEQ ID NO 


1586 


OlliyOdUldiy llCaldd 




12775 


1 


4 


SEQ ID NO 


1587 



sasjpapotQanantaaa/^^ 
aaawdwwiddy dy IdddwU 


wfUUD 


9025 


1 


4 


SEQ ID NO 


1588 


sastnr'f riapatanrtnaaf 
aaaiy o ly a^didy y y ddL 




8448 


1 


4 


SEQ ID NO 


1589 


n 9 99 ta fta tna a r»tf n a a 
y aaa LdLidiyddOiLy dd 




13323 


1 


4 


SEQ ID NO 


1590 


kkLov/Ldddy oiyydiy Ld 


1 1 i oo 


11187 


1 


4 


SEQ ID NO:


1591 


Panntpr'afnr'aanf r*an/* 
^dy y LOLrd ly uddy lUdy u 


•1 noi A 

\ U9 1 1 


10930 1 


4 


SEQ ID NO:


1592


p pa n r*f t/^/^ r' o <Q 
LrUdy U I LUUUUdCdlClCd 


oooo 


8352 


1 


4 


SEQ ID NO:


1593 



fp+tpn+ntttnaar^f nr*r» 
L i^y Ly L I LUddu Ly 


\ 1 Z i o 


11232 


1 


4 


SEQ ID NO:


1594 


\-raay Lady lyoidy y ILOd 




9391 


1 


4 


SEQ ID NO:


1595


^la^ddLrd^^ I idLr Liy dd L Lw 


1 uoou 


10679 


1 


4


SEQ ID NO:


1596 


tpanaaanptappf+rT'Qrt 
a y CI a a y o L d v.r o I Lo O d y 




7950 


1 


4 


SEQ ID NO:


1597


?9tf1 fi a pttp tf ptn na a a a 
ci ly y a V I LV/ L LLr Ly y dddd 


RATH 


8889 


1 


4


SEQ ID NO:


1598


ctcatctcctttcttcatc


10201


10220 


1 


4


SEQ ID NO:


1599


cttttctaaacttgaaatt


9056

9075 


1 


4




SEQ ID NO:


1600


ttatgaacttgaagaaaag 


13310 


13329 


1 


4 


SEQ ID NO: 


1601 


ttttcacattagatgcaaa 


8413 


8432 


1 


4 


SEQ ID NO: 


1602 


aaaattgatgatatctgga 


10719 


10738 


1 


4 


SEQ ID NO: 


1603 


aaaagggtcatggaaatgg 


8885 


8904 


1 


4 


SEQ ID NO: 


1604 


ctcaattttgattttcaag 


8520 


8539 


1 


4 


SEQ ID NO: 


1605 


attccctccattaagttct 


11700 


11719 1 


4 


SEQ ID NO: 


1606 


tttcaagcaagaacttaat 


10427 


10446 


1 


4 
SEQ ID NO: 604


gaagcacatcaatattgat 


6407 


6426 


SEQ ID NO: 605 


acatcaatattgatcaatt 


6412 


6431 


SEQ ID NO: 606 


gaaaactcccacagcaagc 


6457 


6476 


SEQ ID NO: 607 


ctgaattcattcaattggg 


6486 


6505 


SEQ ID NO: 608 


tgaattcattcaattggga


6487 


6506 


SEQ ID NO: 609 


aactgactgctctcacaaa 


6532 


6551 


SEQ ID NO: 610 


aaaagtatagaattacaga 


6550 


6569


SEQ ID NO: 611 


atcaactttaatgaaaaac


6603 


6622 


SEQ ID NO: 612 


tgatttgaaaatagctatt 


6686 


6705 


SEQ ID NO: 613 


atttgaaaatagctattgc 


6688 


6707 


SEQ ID NO: 614 


attgctaatattattgatg 


6702 


6721 


SEQ ID NO: 615 


gaaaaattaaaaagtcttg 


6729 


6748 


SEQ ID NO: 616 


actatcatatccgtgtaat 


6754 


6773 


SEQ ID NO: 617 


tattgattttaacaaaagt 


6815 


6834 


SEQ ID NO: 618


ctgcagcagcttaagagac 


6906 


6925 


SEQ ID NO: 619 


aaaacaacacattgaggct 


6965 


6984 


SEQ ID NO: 620


ttgagcatgtcaaacactt 


7051 


7070 


SEQ ID NO: 621 


tttgaagtagctgagaaaa 


7092 


7111 


SEQ ID NO: 622 


ttagtagagttggcccacc 


7191 


7210 


SEQ ID NO: 623 


tgaaggagactattcagaa 


7219 


7238 


SEQ ID NO: 624 


gagactattcagaagctaa 


7224 


7243 


SEQ ID NO: 625 


aattagttggatttattga 


7285 


7304 


SEQ ID NO: 626 


gcttaatgaattatctttt 


7319 


7338 


SEQ ID NO: 627 


ttaacaaattccttgacat 


7357 


7376 


SEQ ID NO: 628


aaattaaagtcatttgatt 


7386 


7405 


SEQ ID NO: 629 


gactcaatggtgaaattca 


7456 


7475 


SEQ ID NO: 630 


gaaattcaggctctggaac 


7467 


7486 


SEQ ID NO: 631 


actaccacaaaaagctgaa 


7484 


7503 


SEQ ID NO: 632 


ccaaaataaccttaatcat 


7570 


7589 


SEQ ID NO: 633 


aaataaccttaatcatcaa 


7573 


7592 


SEQ ID NO: 634 


tttaagttcagcatctttg 


7607 


7626 


SEQ ID NO: 635 


caggtttatagcacacttg 


7731 


7750 


SEQ ID NO: 636 


gttcactgttcctgaaatc 


7862 


7881 


SEQ ID NO: 637 


cactgttcctgaaatcaag 


7865 


7884 


SEQ ID NO: 638 


actgttcctgaaatcaaga 


7866 


7885 


SEQ ID NO: 639 


gcctgcctttgaagtcagt 


7901 


7920 


SEQ ID NO: 640 


taacagatttgaggattcc 


7972 


7991 


SEQ ID NO: 641 


gttttccacaccagaattt 


8042 


8061 



SEQ ID NO 


: 1607 


atcagttcagataaacttc 


7991 


8010 


1 


4 


SEQ ID NO 


: 1608 


aattccctgaagttgatgt 


11479


11498


1


4 


SEQ ID NO 


: 1609 


gctttctcttccacatttc 


10052 


10071 


1 


4 


SEQ ID NO 


: 1610 


cccatttacagatcttcag


11363 


11382 


1 


4 


SEQ ID NO 


: 1611 


tcccatttacagatcttca 


11362 


11381 


1 


4 


SEQ ID NO 


: 1612 


tttgaggattccatcagtt 


7979 


7998 


1 


4 


SEQ ID NO 


: 1613 


tctggctccctcaactttt 


9042 


9061 


1 


4 


SEQ ID NO 


1614 


gtttattgaaaatattgat 


6803 


6822 


1 


4 


SEQ ID NO 


1615 


aatattattgatgaaatca


6708 


6727 


1 


4 


SEQ ID NO: 


1616 


gcaagaacttaatggaaat 


10433 


10452 


1 


4 


SEQ ID NO: 


1617 


catcacactgaataccaat 


10151 


10170 


1 


4 


SEQ ID NO: 


1618 


caagagcttatgggatttc 


11153 


11172 


1 


4 


SEQ ID NO: 


1619 


attactttgagaaattagt 


7273 


7292 


1 


4 


SEQ ID NO: 


1620 


acttgacttcagagaaata 


11396 


11415 


1 


4 


SEQ ID NO: 


1621 


gtcttcagtgaagctgcag 


10691 


10710 


1 


4 


SEQ ID NO: 


1622 


agcctcacctcttactttt 


10563 


10582 


1 


4 


SEQ ID NO: 


1623 


aagtagctgagaaaatcaa 


7096 


7115 


1 


4 


SEQ ID NO: 


1624 


ttttcacattagatgcaaa 


8413 


8432 


1 


4 


SEQ ID NO: 


1625 


ggtggactcttgctgctaa


7768 


7787 


1 


4 


SEQ ID NO: 


1626 


ttctcaattttgattttca 


8518 


8537 


1 


4 


SEQ ID NO: 


1627 


ttagccacagctctgtctc 


10293 


10312 


1 


4 


SEQ ID NO: 


1628 


tcaagaagcttaatgaatt 


7312 


7331 


1 


4 


SEQ ID NO: 


1629 


aaaacgagcttcaggaagc 


13201 


13220 


1 


4 


SEQ ID NO: 


1630 


atgtcctacaacaagttaa 


7246 


7265 


1 


4 


SEQ ID NO: 


1631 


aatcctttgacaggcattt 


9715 


9734 


1 


4 


SEQ ID NO: 


1632 


tgaaattcaatcacaagtc 


9068 


9087 


1 


4 


SEQ ID NO: 


1633 


gttctcaattttgattttc 


8517 


8536 


1 


4 


SEQ ID NO: 


1634 


ttcaggaactattgctagt 


10637 


10656 


1 


4 


SEQ ID NO: 


1635 


atgatttccctgaccttgg 


10942 


10961 


1 


4 


SEQ ID NO: 


1636 


ttgaagtaaaagaaaattt 


10741 


10760 


1 


4 


SEQ ID NO: 


1637 


caaatctggatttcttaaa 


9472 


9491 


1 


4 


SEQ ID NO: 


1638 


caagggttcactgttcctg


7857 


7876 


1 


4 


SEQ ID NO: 


1639 


gattctcagatgagggaac 


8914 


8933 


1 


4 


SEQ ID NO: 


1640 


cttgaacacaaagtcagtg 


6000 


6019 


1 


4 


SEQ ID NO: 


1641 


tcttgaacacaaagtcagt 


5999 


6018 


1 


4 


SEQ ID NO: 


1642 


actgttgactcaggaaggc 


12572 


12591 


1 


4 


SEQ ID NO: 


1643 


ggaagcttctcaagagtta 


13214 


13233 


1 


4 


SEQ ID NO: 


1644 


aaatttctctgctggaaac 


9410 


9429 


1 


4 
SEQ ID NO: 642


tcagaaccattgaccagat 


8128 


8147 


SEQ ID NO: 643 


tagcgagaatcaccctgcc 


8218 


8237 


SEQ ID NO: 644 


ccttaatgattttcaagtt 


8291 


8310 


SEQ ID NO: 645 


acataccagaattccagct 


8320 


8339 


SEQ ID NO: 646 


aatgctgacatagggaatg 


8430 


8449 


SEQ ID NO: 647 


atgctgacatagggaatgg 


8431 


8450 


SEQ ID NO: 648 


aaccacctcagcaaacgaa 


8450


8469 


SEQ ID NO: 649 


agcaggtatcgcagcttcc 


8468 


8487 


SEQ ID NO: 650 


tgcacaactctcaaaccct 


8543 


8562 


SEQ ID NO: 651 


aggagtcagtgaagttctc 


8584 


8603 


SEQ ID NO: 652 


tttttggaaatgccattga


8644 


8663 


SEQ ID NO: 653 


aatggagtgattgtcaaga 


8721 


8740 


SEQ ID NO: 654 


gtcaagataaacaatcagc 


8733 


8752 


SEQ ID NO: 655 


tccacaaattgaacatccc 


8779 


8798 


SEQ ID NO: 656 


ttgaacatccccaaactgg 


8787 


8806 


SEQ ID NO: 657 


acatccccaaactggactt 


8791 


8810 


SEQ ID NO: 658 


acttctctagtcaggctga 


8806 


8825 


SEQ ID NO: 659 


tgaatcacaaattagtttc 


8936 


8955 


SEQ ID NO: 660 


agaaggacccctcacttcc


8960 


8979 


SEQ ID NO: 661 


ttggactgtccaataagat 


8980 


8999 


SEQ ID NO: 662 


actgtccaataagatcaat 


8984 


9003 


SEQ ID NO: 663 


ctgtccaataagatcaata 


8985 


9004 


SEQ ID NO: 664 


gtttatgaatctggctccc 


9033 


9052 


SEQ ID NO: 665 


atgaatctggctccctcaa 


9037 


9056 


SEQ ID NO: 666 


ctcaacttttctaaacttg 


9051 


9070 


SEQ ID NO: 667 


ctaaaggcatggcactgtt 


9121 


9140 


SEQ ID NO: 668 


aaggcatggcactgtttgg 


9124 


9143 


SEQ ID NO: 669 


atccacaaacaatgaaggg 


9254 


9273 


SEQ ID NO: 670 


ggaatttgaaagttcgttt 


9271 


9290 


SEQ ID NO: 671 


aataactatgcactgtttc 


9324 


9343 


SEQ ID NO: 672 


gaaacaacgagaacattat 


9424 


9443 


SEQ ID NO: 673 


ttcttgaaaacgacaaagc 


9591 


9610 


SEQ ID NO: 674 


ataagaaaaacaaacacag


9640


9659 


SEQ ID NO: 675 


aaaacaaacacaggcattc


9646 


9665 


SEQ ID NO: 676 


gcattccatcacaaatcct 


9659 


9678 


SEQ ID NO: 677 


tttgaaaaaaacagaaaca 


9732 


9751 


SEQ ID NO: 678 


caatgcattagattttgtc 


9749 


9768 


SEQ ID NO: 679 


caaagctgaaaaatctcag 


9809 


9828 



SEQ ID NO 


: 1645 


atctgcagaacaatgctga


12430


12449 1


4 


SEQ ID NO 


: 1646 


ggcagcttctggcttgcta


12293 


12312 1 


4 


SEQ ID NO 


: 1647 


aactgttgactcaggaagg


12571


12590 1


4 


SEQ ID NO 


: 1648 


aqctaccaatccttcatat 




lUUO/ 1 




SEQ ID NO 


: 1649 


cattaatcctoccatcatt 


9QQ7 


1 UU 1 D 1 




SEQ ID NO 


: 1650 


ccatttQaaatcacaacat 


9237 


Q9*%fi 1 


A 


SEQ ID NO 


: 1651


ttcqttttccatta aa att 








SEQ ID NO 


: 1652 


ggaagtggccctgaatgct


10964 






SEQ ID NO 


: 1653 


aqaqaaaojaaaaaatlaca 


1'^4Q'^ 


1 0\J 1 ^ 1 


A 


SEQ ID NO:


1654


gagaacttactatcatcct


13780


13799 1


4


SEQ ID NO: 


1655 


tcaatgaatttattcaaaa 


13186 


13205 1


4 


SEQ ID NO: 


1656 


tcttttcagcccagccatt


9223 


9242 1 


4 


SEQ ID NO: 


1657 


Qctoactttaaaatctoac 


4811 


4830 1 


4


SEQ ID NO: 


1658 


gggatttcctaaaactaaa 


11164


11183 1


4 


SEQ ID NO: 


1659 


ccagtttccagggactcaa


12595 


12614 1 


4


SEQ ID NO: 


1660 


aagtcgattcccagcatat


9082 


9101 1 



4 


SEQ ID NO: 


1661 


tcagatggaaaaatgaaat


11002


11021 1


4 


SEQ ID NO: 


1662 


gaaagtccataatggttca


12809 


12828 1 


4 


SEQ ID NO: 


1663 


ggaagaagaggcagcttct 


12284 


12303 1 


4 


SEQ ID NO: 


1664 


atctaaatgcagtagccaa 


11626 


11645 1 


4 


SEQ ID NO: 


1665 


attgataaaaccatacagt


13883 


13902 1 



4 


SEQ ID NO: 


1666 


tattgataaaaccatacag


13882 


13901 1 


4


SEQ ID NO: 


1667 


ggqaatctQataaoaaaac 


12247 




4


SEQ ID NO: 


1668 


ttgagttgcccaccatcat


11659


11678 1


4


SEQ ID NO: 


1669 


caagatcgcagactttgag


11645


11664 1


4


SEQ ID NO: 


1670 


aacagaaacaatgcattag


9741 


9760 1 


4


SEQ ID NO: 


1671 


ccaagaaaaggcacacctt


11069


11088 1 


4


SEQ ID NO: 


1672 


ccctaacagatttgaggat


7969 


7988 1 


4 


SEQ ID NO: 


1673 


aaacaaacacaggcattcc 


9647 


9666 1 


4 


SEQ ID NO: 


1674 


gaaatactgttttcctatt 


12828 


12847 1 


4 


SEQ ID NO: 


1675 


ataaactgcaagatttttc 


13600 


13619 1 



4 


SEQ ID NO:


1676 


0 nt ft p r* 3 3 f n a rr^ 0 0 <^ a Q 






4 


SEQ ID NO: 


1677 


ctgtgctttgtgagtttat 


9682 


9701 1 


4 


SEQ ID NO: 


1678 


gaatttgaaagttcgtttt 


9272 


9291 1 


4 


SEQ ID NO: 


1679 


aggaagtggccctgaatgc 


10963 


10982 1 


4 


SEQ ID NO: 


1680 


tgttgaaagatttatcaaa 


12925 


12944 1 


4 


SEQ ID NO: 


1681 


gacaagaaaaaggggattg 


10271 


10290 1 


4 


SEQ ID NO: 


1682 


ctgagaacttcatcatttg 


11430 


11449 1 


4 
SEQ ID NO:


680 


cctggatacactgttccag 


9855 


9874 


SEQ ID NO: 1683 


ctggacttctctagtcagg


8802 


8821 


1


4 


SEQ ID NO: 681 


gttgaagtgtctccattca 


9882 


9901 


SEQ ID NO: 1684 


tgaatctggctccctcaac


9038 


9057 



1


4 


SEQ ID NO: 


682 


tttctccatcctaggttct 


9956 


9975 


SEQ ID NO: 1685 


agaatccagatacaaaaaa


6885 


6904 


1


4 


SEQ ID NO: 


683 


ttctccatcctaggttctg 


9957 


9976 


SEQ ID NO: 1686 


cagaatccagatacaaaaa


6884 


6903 



1 


4 


SEQ ID NO: 


684 


tcattagagctgccagtcc 


10011 


10030 


SEQ ID NO: 1687 


ggacagtgaaatattatga


13297 


13316 


1


4 


SEQ ID NO: 


685 


tgctgaactttttaaccag 


10169 


10188 


SEQ ID NO: 1688 


ctggatgtaaccaccaaca 


11178 


11197 1




4 


SEQ ID NO: 686 


ctcctttcttcatcttcat 


10206 


10225 


SEQ ID NO: 1689 


atgaagcttgctccaggag


13764 


13783 


1 




SEQ ID NO: 


687 


tgtcattgatgcactgcag 


10226 


10245 


SEQ ID NO: 1690 


ctgcggtaccagaaagaca


12072 


12091 


1 


4 


SEQ ID NO: 688 


tgatgcactgcagtacaaa


10232 


10251 


SEQ ID NO: 1691 


tttgagttgcccaccatca 


11658 


11677


1 


4 


SEQ ID NO: 


689 


agctctgtctctgagcaac 


10301 


10320 


SEQ ID NO: 1692 


gttgaccacaagcttagct 


10539 


10558 


1 


4 


SEQ ID NO: 


690 


agccgaaattccaattttg 


10400 


10419 


SEQ ID NO: 1693 


caaaqctqqcaGcaQQQct 


13963 


13982 


1


4 


SEQ ID NO: 691 


ttgagaatgaatttcaagc 


10416 


10435 


SEQ ID NO: 1694 


gcttcaggaagcttctcaa


13208 


13227 


1 


4 


SEQ ID NO: 


692 


aaacctactgtctcttcct 


10461 


10480 


SEQ ID NO: 1695 


aggaaggccaagccagttt 


12583 


12602 


1 


4 


SEQ ID NO: 


693 


tacttttccattgagtcat 


10575 


10594 


SEQ ID NO: 1696 


atgattatgtcaacaaata


12355 


12374 


1 


4 


SEQ ID NO: 


694 


tcaggtccatgcaagtcag 


10910 


10929 


SEQ ID NO: 1697 


ctgacatcttaggcactga


4993 


5012 


1 


4 


SEQ ID NO: 


695 


atgcaagtcagcccagttc 


10918 


10937 


SEQ ID NO: 1698 


gaactcagaaggatggcat


13994 


14013 


1 


4 



SEQ ID NO: 696 


tgaatgctaacactaagaa 


10975 


10994 


SEQ ID NO: 1699 


ttctcaattttgattttca


8518 


8537 


1 


4 


SEQ ID NO: 697 


agaagatcagatggaaaaa 


10996 


11015 


SEQ ID NO: 1700 


ttttctaaatggaacttct 


12165 


12184 1 


4 


SEQ ID NO: 


698 


ggctattcattctccatcc 


11256 


11275 


SEQ ID NO: 1701 


ggatctaaatgcagtagcc 


11624 


11643 


1 


4 


SEQ ID NO: 


699 


aaagttttggctgataaat 


11280 


11299 


SEQ ID NO: 1702 


atttcttaaacattccttt 


9481 


9500 


1 


4 


SEQ ID NO: 


700 


agttttggctgataaattc 


11282 


11301 


SEQ ID NO: 1703 


gaatctggctccctcaact


9039 


9058 


1 


4 


SEQ ID NO: 


701 


ctgggctgaaactaaatga


11308 


11327 


SEQ ID NO: 1704 


tcattctgggtctttccag


11027 


11046 


1 


4 


SEQ ID NO: 


702 


cagagaaatacaaatctat


11405 


11424 


SEQ ID NO: 1705 


atagcatggacttcttcta


8865 


8884 


1 


4 


SEQ ID NO: 


703 


gaggtaaaattccctgaag 


11472 


11491 


SEQ ID NO: 1706 


cttctggcttgctaacctc 


12298 


12317 


1 


4 


SEQ ID NO: 


704 


cttttttgagataaccgtg 


11537 


11556 


SEQ ID NO: 1707 


cacggagttactgaaaaag


13715 


13734 


1 


4 


SEQ ID NO: 


705 


gctggaattgtcattcctt


11727 


11746 


SEQ ID NO: 1708 


aaggcatctccacctcagc 


12094 


12113 


1 


4 


SEQ ID NO: 


706 


gtgtataatgccacttgga 


11787 


11806 


SEQ ID NO: 1709 


tccaagatgagatcaacac 


13096 


13115 


1 


4 


SEQ ID NO: 


707 


attccacatgcagctcaac 


11851 


11870 


SEQ ID NO: 1710 


gttgagaagccccaagaat 


6246 


6265 


1 


4 



SEQ ID NO: 


708 


tgaagaagatggcaaattt 


11984 


12003 


SEQ ID NO: 1711 


aaattctcttttcttttca 


9212 


9231 


1 


4 


SEQ ID NO: 


709 


atcaaaagcccagcgttca 


12042 


12061 


SEQ ID NO: 1712 


tgaaagtcaagcatctgat


12661 



12680 


1 


4 


SEQ ID NO: 


710 


gtgggcatggatatggatg 


12135 


12154 


SEQ ID NO: 1713 


catccttaacaccttccac 


8063 


8082 


1 


4 


SEQ ID NO: 


711 


aaatggaacttctactaca 


12171 


12190 


SEQ ID NO: 1714 


tctaccataaaccatattt 


10080 


10099 


1 


4 


SEQ ID NO: 


712 


aaaaactcaccatattcaa 


12211 


12230 


SEQ ID NO: 1715 


ttgatgttagagtgctttt 


6985 


7004 


1 


4 


SEQ ID NO: 


713 


ctgagaagaaatctgcaga 


12420 


12439 


SEQ ID NO: 1716 


tctgcacagaaatattcag 


13439 


13458 


1 


4 


SEQ ID NO: 


714 


acaatgctgagtgggttta 


12439 


12458 


SEQ ID NO: 1717 


taaatggagtctttattgt 


14078 


14097 


1 


4 


SEQ ID NO: 


715 


caatgctgagtgggtttat 


12440 


12459 


SEQ ID NO: 1718 


ataaatggagtctttattg 


14077 


14096 1 


4 


SEQ ID NO: 


716 


ttaggcaaattgatgatat 


12469 


12488 


SEQ ID NO: 1719 


atattgtcagtgcctctaa 


13384 


13403 


1 


4 


SEQ ID NO: 


717 


ataaactaatagatgtaat 


12889 


12908 


SEQ ID NO: 1720 


attactatgaaaaatttat


13633 


13652 


1 


4 
SEQ ID NO:


718 


ccaa eta a ta CI a a ci a ta ac 


13031 


13050 


SEQ ID NO: 1721


a ttatttta etaaactta a 


14044 


14063 


1


4 


SEQ ID NO 


719 


ttaattatatccaaaatoa 


13087 


13106 


SEQ ID NO: 1722


tcatcctctaattttttaa 


13792 


13811


1


4 


SEQ ID NO:


720 


tttaaattnttnaaana^a 


13143 


13162 


SEQ ID NO: 1723


tttna tttn aa a aaataaa 


7024 


7043 


1 


4 


SEQ ID NO: 


721 


santtpaatoasittl'attc 


13182 


13201 


SEQ ID NO: 1724


a a ataccaata eta aactt 


10160 



10179 



1


4 



SEQ ID NO: 


722 


tfaaaaaaasiaataatcao 


13318 


13337 


SEQ ID NO: 1725


etaaaaaaaatatetteaa 


12399 



12418 


1


4 


SEQ ID NO:


723 


31 (T ft r* p tl'P tn A Sit 31 1 a t 


13369 



13388 


SEQ ID NO: 1726


flitstntaaaaccttcjaaat 



10729 


10748 


1


4 


SEQ ID NO: 


724 


caca o aaatattcaa o a at 


13443 


13462 


SEQ ID NO: 1727


attecctaaaottaatata 


11480 


11499 


1 


4 


SEQ ID NO: 


725 


ccstl'Gcoacaasiaaaaat 


13552 


13571 


SEQ ID NO: 1728


attttta tteeta coata a 


10095 



10114 


1


4 


SEQ ID NO: 


726 


tataaactacaaaattttt 


13599 


13618 


SEQ ID NO: 1729


aa aattca aacta cctata 


13865 


13884 



1 


4 


SEQ ID NO: 727 


tctoattactataaaaaat 


13629 


13648 


SEQ ID NO: 1730 


atttataaa aaaataeaaa 


6428 


6447 


1 


4 


SEQ ID NO: 


728 


Q a a o ttactaaa a aaa eta 


13718 


13737 


SEQ ID NO: 1731


caaeatacctaatttctcc 


9944 



9963 


1 


4 


SEQ ID NO: 


729 


t a aa Qctta ctccaa a a a a 


13765 


13784 


SEQ ID NO: 1732 


tctcctttcttcatcttca


10205 


10224 


1 


4 


SEQ ID NO: 


730 


ta aactaa accta caccaa 


13947 


13966 


SEQ ID NO: 1733


ttaataaaacaaaaattea 


7848 


7867 


1 


4 


SEQ ID NO: 


731 


ttaclaaacttaaGQQ aaa 


14050 


14069 


SEQ ID NO: 1734 


cctcctacaatggtggcaa


4222 


4241 


1 


4 


SEQ ID NO: 


732 


a attcQ aatat ca aattca 


4404 


4423 


SEQ ID NO: 1735 


tgaaaacgacaaagcaatc


9595 


9614 


3 


3 


SEQ ID NO: 


733 


atttgtttgtcaaagaagt


4543 


4562 


SEQ ID NO: 1736 


acttttctaaacttgaaat


9055


9074 


3 


3 


SEQ ID NO: 


734 


tctcaattactaccQctaa 


25 


44 


SEQ ID NO: 1737 


tcagcccagccatttgaga


9228 


9247 


2 


3 


SEQ ID NO: 


735 


a eta aa a aa coca ccx^aa c 


39 


58 


SEQ ID NO: 1738 


actggatgtaaccaccaac


11177 


11196 2 


3 


SEQ ID NO: 


736 


ctaotctatccaaaaaato 


219 


238 


SEQ ID NO: 1739 


catcagaaccattgaccag


8126


8145 


2 


3 


SEQ ID NO: 


737 


ctaaaaattccaataaaat 


283 



302 


SEQ ID NO: 1740


actcaatggtgaaattcag


7457


7476 


2 


3 


SEQ ID NO: 


738 


caatacaccctaaaaGaaa 


396 


415 


SEQ ID NO: 1741


cctcacttcctttggactg


8969


8988 


2 


3 



SEQ ID NO: 


739 


ctctaaaaaatttactaca 


464 


483 


SEQ ID NO: 1742


taeaaaettaaettcaaao 



11391 



11410 2 


3 


SEQ ID NO: 


740 


a catcaaa aa aaa catcat 


574 


593 


SEQ ID NO: 1743


ataaeattcttaaaeatat 



7042 



7061 


2 


3 


SEQ ID NO: 


741 


ctaatcaacaacaaccaat 


822 


841 


SEQ ID NO: 1744


actggacttctctagtcag


8801 


8820 


2 


3 


SEQ ID NO: 


742 


aaacactaaaaaaaaacat 


857 


876 


SEQ ID NO: 1745


ata ceta ca tteeata tec 



11346 


11365 2 


3 


SEQ ID NO: 


743 


a a eta ttttaa aa a ctctc 


1079 


1098 


SEQ ID NO: 1746 


aaaaaatatcttcaaaaet 



12403 


12422 2 


3 


SEQ ID NO: 


744 


taaaaaaactaaccatctc 


1105 


1124 


SEQ ID NO: 1747 


gagatcaacacaatcttca


13104 


13123 2 


3 


SEQ ID NO: 


745 


ctaaa eta a □ aa a cctcaa 


1168 


1187 


SEQ ID NO: 1748 


eta aattactacaectcaa 


3027 


3046 


2 


3 


SEQ ID NO: 


746 


taaaacatatacataccaa 


1303 


1322 


SEQ ID NO: 1749 


ttaataqaacaaaqqttea 


7848 


7867 


2 


3 


SEQ ID NO: 


747 


ccttatatocactaaacca 


1432 


1451 


SEQ ID NO: 1750 


taacactatttaaaaaaqq 


9130 


9149 


2 


3 


SEQ ID NO: 


748 


aa a aa eta eta a acattae 


1492 


1511 


SEQ ID NO: 1751 


gcaagtcagcccagttcct


10920 


10939 2 


3 


SEQ ID NO: 


749 


oXiig aiiciy ey y y iCai 


-I 

1 DO r 


1 OOO 


SEQ ID NO: 1752


diy dddLrUclcl ly dUctada L 


7420


7439 


2 




SEQ ID NO: 


750 


tccagaactcaagtcttca


1619 


1638 


SEQ ID NO: 1753 


tgaaatacaatgctctgga


5512 


5531 


2 


3 


SEQ ID NO: 


751 


ggttcttcttcagactttc


1736 


1755 


SEQ ID NO: 1754 


gaaataccaagtcaaaacc


10447 


10466 2 


3 


SEQ ID NO: 


752 


gttgatgaggagtccttca


1802 


1821 


SEQ ID NO: 1755 


tgaaaaagctgcaatcaac


13726 


13745 2 


3 


SEQ ID NO: 753 


tccaagatctgaaaaagtt 


1933 


1952 


SEQ ID NO: 1756 


aactgcttcteoaaatg g a 


3544 


3563 


2 


3 


SEQ ID NO: 


754 


agttagtgaaagaagttct 


1948 


1967 


SEQ ID NO: 1757 


agaattcataatcccaact


8267 


8286 


2 


3 


SEQ ID NO: 755 


gaagggaatcttatatttg 


2076 


2095 


SEQ ID NO: 1758 


caaaacctactgtctcttc


10459


10478 2 


3 
SEQ ID NO:


756 


ggaagctctttttgggaag 


2213 


2232 


SEQ ID NO: 1759 


cttcacataccagaattcc 


8316 


8335 2 


3 


SEQ ID NO: 


757 


tggaataatgctcagtgtt 


2366 


2385 


SEQ ID NO: 1760 


aacaaacacaggcattcca 


9648 


9667 2 


3 


SEQ ID NO: 


758 


gatttgaaatccaaagaag 


2400 


2419 


SEQ ID NO: 1761 


cttcatgtccctagaaatc 


10029 


10048 2 


3 


SEQ ID NO: 


759 


tccaaagaagtcccggaag 


2409 


2428 


SEQ ID NO: 1762 


cttcagcctgctttctgga


4943 


4962 2 


3 


SEQ ID NO: 


760 


aggaagggctcaaagaatg 


2562 


2581 


SEQ ID NO: 1763 


cattagagctgccagtcct 


10012 


10031 2 


3 


SEQ ID NO: 


761 


agaatgacttttttcttca 


2575 


2594 


SEQ ID NO: 1764 


tgaagatgacgacttttct


12152 


12171 2 


3 


SEQ ID NO: 


762 


tttgtgacaaatatgggca


2757 


2776 


SEQ ID NO: 1765 


tgccagtttgaaaaacaaa 


11807 


11826 2 


3 


SEQ ID NO: 


763 


ctgaggctaccatgacatt


3244 


3263 


SEQ ID NO: 1766 


aatgtcagctcttgttcag 


10895 


10914 2 


3 


SEQ ID NO: 


764 


gtagataccaaaaaaatga


3660 


3679 


SEQ ID NO: 1767 


tcatttgccctcaacctac 


11442 


11461 2 


3 


SEQ ID NO: 


765 


aaatgacttccaatttccc 


3673 


3692 


SEQ ID NO: 1768 


gggaactgttgaaagattt 


12919 


12938 2 


3 


SEQ ID NO: 


766 


atgacttccaatttccctg 


3675 


3694 


SEQ ID NO: 1769 


caggagaacttactatcat 


13777 


13796 2 


3 


SEQ ID NO: 


767 


atctgccatctcgagagtt


4096 


4115 


SEQ ID NO: 1770 


aactcctccactgaaagat 


9539 


9558 2 


3 


SEQ ID NO: 


768 


atttgtttgtcaaagaagt 


4543 


4562 


SEQ ID NO: 1771 


acttccgtttaccagaaat 


8239 


8258 2 


3 


SEQ ID NO: 


769 


gcagagcttggcctctctg


5127 


5146 


SEQ ID NO: 1772 


cagagctttctgccactgc 


13510 


13529 2 


3 


SEQ ID NO: 


770 


atatgctgaaatgaaattt 


5345 


5364 


SEQ ID NO: 1773 


aaattcaaactgcctatat 


13866 


13885 2 


3 


SEQ ID NO: 


771 


tcaaaacttgacaacattt 


5412 


5431 


SEQ ID NO: 1774 


aaatacttccacaaattga 


8772 


8791 2 


3 


SEQ ID NO: 


772 


cagtgacctgaaatacaat 


5504 


5523 


SEQ ID NO: 1775 


attgaacatccccaaactg 


8786 


8805 2 


3 


SEQ ID NO: 


773 


tacaaatggcaatgggaaa 


5840 


5859 


SEQ ID NO: 1776 


tttcaactgcctttgtgta 


11221 


11240 2 


3 


SEQ ID NO: 


774 


cttttgtaaagtatgataa 


6277 


6296 


SEQ ID NO: 1777 


ttattgctgaatccaaaag 


13648 


13667 2 


3 


SEQ ID NO: 


775 


ttgtaaagtatgataaaaa 


6280 


6299 


SEQ ID NO: 1778 


ttttcaagcaaatgcacaa 


8531 


8550 2 


3 


SEQ ID NO: 


776 


tccattaacctcccatttt 


6312 


6331 


SEQ ID NO: 1779 


aaaagaaaattttgctgga 


10748 


10767 2 


3 


SEQ ID NO: 


777 


gattatctgaattcattca 


6480 


6499 


SEQ ID NO: 1780 


tgaagtagaccaacaaatc


7154 


7173 2 


3 


SEQ ID NO: 


778 


aattgggagagacaagttt 


6498 


6517 


SEQ ID NO: 1781 


aaactaaatgatctaaatt 


11316 


11335 2 


3 


SEQ ID NO: 


779 


atttgaaaatagctattgc 


6688 


6707 


SEQ ID NO: 1782 


gcaatttctgcacagaaat 


13433 


13452 2 


3 


SEQ ID NO: 


780 


tgagcatgtcaaacacttt


7052 


7071 


SEQ ID NO: 1783 


aaagccattcagtctctca 


12963 


12982 2 


3 


SEQ ID NO: 


781 


ttgaagatgttaacaaatt 


7348 


7367 


SEQ ID NO: 1784 


aattccatatgaaagtcaa 


12652 


12671 2 


3 


SEQ ID NO: 


782 


acttgtcacctacatttct 


7745 


7764 


SEQ ID NO: 1785 


agaatattttgatccaagt 


13268 


13287 2 


3 


SEQ ID NO: 


783 


gttttccacaccagaattt 


8042 


8061 


SEQ ID NO: 1786 


aaatctggatttcttaaac 


9473 


9492 2 


3 


SEQ ID NO: 


784 


ataagtacaaccaaaattt 


9397 


9416 


SEQ ID NO: 1787 


aaataaatggagtctttat 


14075 


14094 2 


3 


SEQ ID NO: 


785 


cgggacctgcggggctgag 


0 


19 


SEQ ID NO: 1788 


ctcagttaactgtgtcccg 


11563 


11582 1 


3 


SEQ ID NO: 


786 


agtgcccttctcggttgct


17 


36 


SEQ ID NO: 1789 


agcatctgattgactcact 


12670 


12689 1 


3 


SEQ ID NO: 


787 


gctgaggagcccgcccagc 


39 


58 


SEQ ID NO: 1790 


gctgattgaggtgtccagc 


1217 


1236 1 


3 


SEQ ID NO: 


788 


gaggagcccgcccagccag 


42 


61 


SEQ ID NO: 1791 


ctggatcacagagtccctc


3744 


3763 1 


3 


SEQ ID NO: 


789 


gggccgcgaggccgaggcc


64


83 


SEQ ID NO: 1792 


ggccctgatccccgagccc 


1355 


1374 1 


3 


SEQ ID NO: 


790 


ccaggccgcagcccaggag 


81 


100 


SEQ ID NO: 1793 


ctcccggagccaaggctgg 


2674 


2693 1 


3 


SEQ ID NO: 


791 


ggagccgccccaccgcagc 


96 


115 


SEQ ID NO: 1794 


gctgttttgaagactctcc 


1080 


1099 1 


3 


SEQ ID NO: 


792 


gaagaggaaatgctggaaa 


192 


211 


SEQ ID NO: 1795 


tttcaagttcctgaccttc 


8301 


8320 1 


3 


SEQ ID NO: 


793 


caaaagatgcgacccgatt


229 


248 


SEQ ID NO: 1796 


aatcttattggggattttg 


7077 


7096 1 


3 
SEQ


ID NO: 


794 


attcaagcacctccggaag


245 


264 


SEQ ID NO: 1797


cttccacatttcaaaaaat 




10078 1 


3 



SEQ 


ID NO: 


795 


gttccagtggagtccctgg 


289 


308 


SEQ ID NO: 1798


ccaocaaatacctaaaaac 


8602 


8691 1 


3 



SEQ 


ID NO: 


796 


gactgctgattcaagaagt 


308 


327 


SEQ ID NO: 1799 


acttaaaoaaaaaataatc 


13316 


13335 1 


3 


SEQ 


ID NO: 


797 


gtgccaccaggatcaactg


325 


344 


SEQ ID NO: 1800 


caQtoaaactocaaaacac 


10696 


10715 1 



3 



SEQ 


ID NO: 


798 


gatcaactgcaaggttgag 


335 


354 


SEQ ID NO: 1801 


ctcacctccacctctn ate 




4759 1 


3 



SEQ 


ID NO: 


799 


actgcaaggttgagctgga 


340 




SEQ ID NO: 1802 


tccactcacatcctacacit 


1281 


1300 1 



3 



SEQ 


ID NO: 


800 


ccagctctgcagcttcatc


365 


384 


SEQ ID NO: 1803 


oatataatcacctacctaa 


1335 



1354 1 


3 



SEQ 


ID NO: 


801 


agcttcatcctgaagacca 


375 


394 


SEQ ID NO: 1804 


tQQtactaGaoaataaoct 


5104 


5123 1 



3 



SEQ 


ID NO: 


802 


cttcatcctgaagaccagc 


377 


396 


SEQ ID NO: 1805 


Qctaaaataaaactaaasa 


2688 


2707 1 


3 



SEQ 


ID NO: 


803 


ccagccagtgcaccctgaa 


391 


410 


SEQ ID NO: 1806 


ttcaaoata ad a cani nn 


1531 


1550 1 



3


SEQ 


ID NO: 


804 


cagtgcaccctgaaagagg 


396 


415 


SEQ ID NO: 1807 


cctcacaQaoctatcacta 


5222 


5241 1 



3 



SEQ 


ID NO: 


805 


tggcttcaaccctgagggc


419 


438 


SEQ ID NO: 1808 


Qcccactaatcacctacca 


3525 


3544 1 



3 


SEQ 


ID NO: 806 


cttcaaccctgagggcaaa 


422 


441 


SEQ ID NO: 1809 


tttoaQCcaacattaaaaa 


2199 


2218 1 


3 



SEQ 


ID NO: 


807 


ttcaaccctgagggcaaag 


423 


442 


SEQ ID NO: 1810 


ctttq acaa q catttta aa 


9719 


9738 1 


3 



SEQ 


ID NO: 


808 


cttgctgaagaaaaccaag 


443 


462 


SEQ ID NO: 1811 


cttq a aa ttcaa tcac a a a 


9066 


9085 1 


3 



SEQ 


ID NO: 


809 


tgctgaagaaaaccaagaa 


445 


464 


SEQ ID NO: 1812 


ttct Q eta ccttatcaa ca 


5639 


5658 1 


3 



SEQ 


ID NO: 


810 


ttgctgcagccatgtccag 


475 


494 


SEQ ID NO: 1813 


ctqatcaatttocaaacaa 


2996 


3015 1 


3 



SEQ 


ID NO: 


811 


tgctgcagccatgtccagg 


476 


495 


SEQ ID NO: 1814 


ccta a tcaa ttta caaa ca 


2995 



3014 1 



3 



SEQ 


ID NO: 


812 


agccatgtccaggtatgag


482 


501 


SEQ ID NO: 1815 


ctcacatcctccaataact 


1285 


1304 1 


3 



SEQ 


ID NO: 


813 


agctcaagctggccattcc 


499 


518 


SEQ ID NO: 1816 


qqaactaccacaaaaaact 


7481 


7500 1 


3 



SEQ 


ID NO: 


814 


agaagggaagcaggttttc 


518 


537 


SEQ ID NO: 1817 


aaaatcttcaatttattct 


13813 



13832 1 



3 



SEQ 


ID NO: 


815 


aagggaagcaggttttcct 


520 


539 


SEQ ID NO: 1818 


aaaacaccaaaataacctt 


7564 



7583 1 



3 



SEQ 


ID NO: 


816 


agaaagatgaacctactta 


547 


566 


SEQ ID NO: 1819 


taao aactttaccacttct 


4844 



4863 1 



3 



SEQ 


ID NO: 


817 


atcctgaacatcaagaggg 


567 


586 


SEQ ID NO: 1820 


ccctaacaqatttaaaaat 


7969 


7988 1 



3 



SEQ 


ID NO: 


818 


tcctgaacatcaagagggg 


568 


587 


SEQ ID NO: 1821 


cccctaacaaatttaaaaa 


7968 



7987 1 



3 



SEQ 


ID NO: 


819 


ctgaacatcaagaggggca 


570 


589 


SEQ ID NO: 1822


tacctacctttaaaatcaa 


7900 


7919 1



3


SEQ 


ID NO: 


820 


aacatcaagaggggcatca 


573 


592 


SEQ ID NO: 1823 


tqataaaaaccaaoatatt 


6290 


6309 1 


3 



SEQ 


ID NO: 


821 


acatcaagaggggcatcat



574 


593 


SEQ ID NO: 1824 


atqataaaaaccaaqatot 


6289 


6308 1 


3 


SEQ 


ID NO: 


822 


tcatttctgccctcctggt 


589 


608 


SEQ ID NO: 1825 


accaccaatttataoataa 


7405 


7424 1 


3 



SEQ 


ID NO: 


823 


ttcccccagagacagaaga 


607 


626 


SEQ ID NO: 1826 


tcttccacatttcaaaaaa


10058 



10077 1 



3 



SEQ 


ID NO: 


824 


gaagaagccaagcaagtgt 


621 


640 


SEQ ID NO: 1827 


acaccttccacattccttc 


8071 


8090 1 


3 


SEQ 


ID NO: 


825 


ttgtttctggataccgtgt


639 


658 


SEQ ID NO: 1828


acantaaatflnttppapaa 



8767 


8786 1


3


SEQ 


ID NO: 


826 


tgtatggaaactgctccac 


655 


674 


SEQ ID NO: 1829 


gtggaggcaacacattaca 


2920 


2939 1 


3 


SEQ 


ID NO: 


827 


aaactgctccactcacttt 


662 


681 


SEQ ID NO: 1830 


aaagaaacagcatttgttt 


4532 


4551 1 


3 


SEQ 


ID NO: 


828 


actcactttaccgtcaaga


672 


691 


SEQ ID NO: 1831 


tcttacttttccattgagt 


10572 


10591 1 


3 


SEQ 


ID NO: 829 


ctttaccgtcaagacgagg 


677 


696 


SEQ ID NO: 1832 


cctccagctcctgggaaag


2483 


2502 1 


3 


SEQ 


ID NO: 


830 


ttaccgtcaagacgaggaa 


679 


698 


SEQ ID NO: 1833 


ttcctaaagctggatgtaa 


11169 


11188 1 


3 


SEQ 


ID NO: 


831 


acgaggaagggcaatgtgg 


690 


709 


SEQ ID NO: 1834 


ccacaagtcatcatctcgt 


5956 


5975 1 


3 





SEQ ID NO: 


832 


cgaggaagggcaatgtggc 


691 


710 


SEQ ID NO: 1835 


gccagaagtgagatcctcg 


3507 


3526 


1 


3 


SEQ ID NO: 


833 


gaggaagggcaatgtggca 


692 


711 


SEQ ID NO: 1836 


tgccagtctccatgacctc 


2468 


2487 


1 


3 


SEQ ID NO: 


834 


ggaagggcaatgtggcaac 


694 


713 


SEQ ID NO: 1837 


gttgctcttaaggacttcc 


13356 


13375 


1 


3 


SEQ ID NO: 


835 


gaagggcaatgtggcaaca 


695 


714 


SEQ ID NO: 1838 


tgttgatgaggagtccttc 


1801 


1820 


1 


3 


SEQ ID NO: 


836 


caggcatcagcccacttgc 


769 


788 


SEQ ID NO: 1839 


gcaagtctttcctggcctg 


3011 


3030 


1 


3 


SEQ ID NO: 


837 


aggcatcagcccacttgct 


770 


789 


SEQ ID NO: 1840 


agcaagtctttcctggcct 


3010 


3029 


1 


3 


SEQ ID NO: 


838 


tcagcccacttgctctcat 


775 


794 


SEQ ID NO: 1841 


atgaaagtcaagcatctga 


12660 


12679 


1 


3 


SEQ ID NO: 


839 


gtcaactctgatcagcagc 


815 


834 


SEQ ID NO: 1842 


gctgactttaaaatctgac 


4811 


4830 


1 


3 


SEQ ID NO: 


840 


ggacgctaagaggaagcat 


857 


876 


SEQ ID NO: 1843 


atgcactgtttctgagtcc 


9331 


9350 


1 


3 


SEQ ID NO: 


841 


aaggagcaacacctcttcc 


894 


913 


SEQ ID NO: 1844 


ggaatatcttagcatcctt 


13457 


13476 


1 


3 


SEQ ID NO: 


842 


aggagcaacacctcttcct 


895 


914 


SEQ ID NO: 1845 


aggaatatcttagcatcct


13456


13475 


1 


3 


SEQ ID NO: 843 


caacacctcttcctgcctt 


900 


919 


SEQ ID NO: 1846 


aaggctgactctgtggttg 


4284 


4303 


1 


3 


SEQ ID NO: 


844 


aacacctcttcctgccttt 


901 


920 


SEQ ID NO: 1847 


aaagcaggccgaagctgtt 


1067 


1086 


1 


3 


SEQ ID NO: 845 


acaagaataagtatgggat 


925 


944 


SEQ ID NO: 1848 


atccatgatctacatttgt 


6786 


6805 


1 


3 


SEQ ID NO: 846 


caagaataagtatgggatg 


926 


945 


SEQ ID NO: 1849 


catcactttacaagccttg 


1238 


1257 


1 


3 


SEQ ID NO: 


847 


tagcacaagtgacacagac 


946 


965 


SEQ ID NO: 1850 


gtctcttcgttctatgcta 


4584 


4603 


1 


3 


SEQ ID NO: 


848 


agcacaagtgacacagact 


947 


966 


SEQ ID NO: 1851 


agtctcttcgttctatgct 


4583 


4602 


1 


3 


SEQ ID NO: 


849 


gcacaagtgacacagactt 


948 


967 


SEQ ID NO: 1852 


aagtgtagtctcctggtgc 


5091 


5110 


1 


3 


SEQ ID NO: 


850 


aacttgaagacacaccaaa 


970 


989 


SEQ ID NO: 1853 


tttgaggattccatcagtt 


7979 


7998 


1 


3 


SEQ ID NO: 


851 


gcttctttggtgaaggtac 


1000 


1019 


SEQ ID NO: 1854 


gtacctacttttggcaagc 


8364 


8383 


1 


3 


SEQ ID NO: 


852 


ctttggtgaaggtactaag 


1004 


1023 


SEQ ID NO: 1855 


cttatgggatttcctaaag 


11159 


11178 


1 


3 


SEQ ID NO: 


853 


tactaagaagatgggcctc 


1016 


1035 


SEQ ID NO: 1856 


gagggtagtcataacagta 


10329 


10348 


1 


3 


SEQ ID NO: 


854 


tttgagagcaccaaatcca 


1038 


1057 


SEQ ID NO: 1857 


tggaagtgtcagtggcaaa 


10372 


10391 


1 


3 


SEQ ID NO: 


855 


agagcaccaaatccacatc 


1042 


1061 


SEQ ID NO: 1858 


gatggatatgaccttctct 


4868 


4887 


1 


3 


SEQ ID NO: 


856 


agctgttttgaagactctc 


1079 


1098 


SEQ ID NO: 1859 


gagaacatactgggcagct 


5872 


5891 


1 


3 


SEQ ID NO: 


857 


tgaaaaaactaaccatctc 


1105 


1124 


SEQ ID NO: 1860 


gagaaaatcaatgccttca 


7104 


7123 


1 


3 


SEQ ID NO: 


858 


gaaaaaactaaccatctct 


1106 


1125 


SEQ ID NO: 1861 


agagccaggtcgagctttc 


11044 


11063 


1 


3 


SEQ ID NO:


859 


tctgagcaaaatatccaga 


1122 


1141 


SEQ ID NO: 1862 


tctgatgaggaaactcaga 


12252 


12271 


1 


3 


SEQ ID NO: 


860 


tctcttcaataagctggtt 


1148 


1167 


SEQ ID NO: 1863 


aacctcccattttttgaga 


6318 


6337 


1 


3 


SEQ ID NO: 


861 


ctgagctgagaggcctcag 


1168 


1187 


SEQ ID NO: 1864 


ctgatccccgagccctcag 


1359 


1378 


1 


3 


SEQ ID NO: 


862 


tgaagcagtcacatctctc 


1190 


1209 


SEQ ID NO: 1865 


gagaaaatcaatgccttca 


7104 


7123 


1 


3 


SEQ ID NO: 


863 


aagcagtcacatctctctt 


1192 


1211 


SEQ ID NO: 1866 


aagaggcagcttctggctt 


12289 


12308 


1 


3 


SEQ ID NO: 


864 


ctctcttgccacagctgat 


1204 


1223 


SEQ ID NO: 1867 


atcaaaagaagcccaagag 


12938 


12957 


1 


3 


SEQ ID NO: 


865 


tcttgccacagctgattga 


1207 


1226 


SEQ ID NO: 1868 


tcaaagttaattgggaaga 


12271 


12290 


1 


3 


SEQ ID NO: 


866 


cttgccacagctgattgag 


1208 


1227 


SEQ ID NO: 1869 


ctcaattttgattttcaag 


8520 


8539 


1 


3 


SEQ ID NO: 


867 


tgaggtgtccagccccatc 


1223 


1242 


SEQ ID NO: 1870


gatggaaccctctccctca 


4725 


4744 


1 


3 


SEQ ID NO: 


868 


tcagtgtggacagcctcag 


1259 


1278 


SEQ ID NO: 1871 


ctgacatcttaggcactga 


4993 


5012 


1 


3 


SEQ ID NO: 


869 


acatcctccagtggctgaa 


1288 


1307 


SEQ ID NO: 1872 


ttcagaagctaagcaatgt 


7231 


7260 


1 


3 
SEQ ID NO:


870 


gcacagcagctgcgagaga 


1377 


1396 


SEQ ID NO: 


871 


cagcagctgcgagagatct 


1380 


1399 


SEQ ID NO: 


872 


gcgagggatcagcgcagcc 


1407 


1426 


SEQ ID NO: 


873 


aagacaaaccctacaggga 


1470 


1489 


SEQ ID NO: 


874 


caggagctgctggacattg 


1491 


1510 


SEQ ID NO: 


875 


aggagctgctggacattgc 


1492 


1511 


SEQ ID NO: 


876 


ctgctggacattgctaatt 


1497 


1516 


SEQ ID NO:


877 


gattacacctatttgattc


1557 


1576 


SEQ ID NO: 


878 


atttgattctgcgggtcat 


1567 


1586 


SEQ ID NO: 


879 


tctgcgggtcattggaaat 


1574 


1593 


SEQ ID NO: 


880 


aaccatggagcagttaact 


1601 


1620 


SEQ ID NO: 


881 


ggagcagttaactccagaa 


1607 


1626 


SEQ ID NO: 


882 


actccagaactcaagtctt 


1617 


1636 


SEQ ID NO: 


883 


tccagaactcaagtcttca


1619 


1638 


SEQ ID NO: 


884 


aagtacaaagccatcactg 


1655 


1674 


SEQ ID NO: 


885 


gccatcactgatgatccag 


1664 


1683 


SEQ ID NO: 


886 


ccatcactgatgatccaga 


1665 


1684 


SEQ ID NO: 


887 


atccagaaagctgccatcc 


1677 


1696 


SEQ ID NO: 


888 


cagaaagctgccatccagg 


1680 


1699 


SEQ ID NO: 


889 


acaaggaccaggaggttct 


1723 


1742 


SEQ ID NO: 


890 


aggaccaggaggttcttct 


1726 


1745 


SEQ ID NO: 


891 


accaggaggttcttcttca


1729 


1748 


SEQ ID NO: 


892 


tcttcagactttccttgat 


1742 


1761 


SEQ ID NO: 


893 


ttcagactttccttgatga


1744 


1763 


SEQ ID NO: 


894 


gttgatgaggagtccttca 


1802 


1821 


SEQ ID NO: 


895 


cttcacaggcagatattaa 


1816 


1835 


SEQ ID NO: 


896 


ttcacaggcagatattaac 


1817 


1836 


SEQ ID NO: 


897 


ggcagatattaacaaaatt 


1823 


1842 


SEQ ID NO: 


898 


atattaacaaaattgtcca 


1828 


1847 


SEQ ID NO: 


899 


acaaaattgtccaaattct 


1834 


1853 


SEQ ID NO: 


900 


gagcaagtgaagaactttg 


1869 


1888 


SEQ ID NO: 


901 


gtgaagaactttgtggctt 


1875 


1894 


SEQ ID NO: 


902 


agaactttgtggcttccca 


1879 


1898 


SEQ ID NO: 


903 


tttgtggcttcccatattg 


1884 


1903 


SEQ ID NO: 


904 


tggcttcccatattgccaa 


1888 


1907 


SEQ ID NO: 


905 


ttcccatattgccaatatc 


1892 


1911 


SEQ ID NO: 


906 


tcccatattgccaatatct 


1893 


1912 


SEQ ID NO: 


907 


ttgccaatatcttgaactc


1900 


1919 
SEQ ID NO:


1873 


tctctQaaaapinaacntrir* 


\ 1 o 


1 ZOOH 


1 


o 
O 


SEQ ID NO:


1874 


aQataacattaapo<? nntn 






1 


o 


SEQ ID NO: 


1875 


QQCtcaacsr:?3nsr?3tpnr> 


O / 1 u 


^^70Q 


1 


o 
o 


SEQ ID NO:


1876 


tcccaaaaaarrtrttrtt 




oy*f 1 


1 


o 

o 


SEQ ID NO: 


1877 


caatQaaciaafmaarc1"n 






1 


O 


SEQ ID NO:


1878 


Qcaaaaattr^rlnttrH' 


/ ooo 


/ Oi O 


■i 
1 


Q 

o 


SEQ ID NO: 


1879 


aattQOOaaoOr5iasnnr?Hin 







1 


o 


SEQ ID NO: 


1880 


QaatattttOBctaGapir^tn 






-1 
1 




SEQ ID NO: 


1881 


atoaaDtaa3ccs[acp;^;nt 


71 


7179


1 


3


SEQ ID NO: 


1882 


atttataaoaaaatacaaa 


6428 


6447 




3


SEQ ID NO: 


1883 


acstttctccatcctaaatt 


9954 


w57 / w 


1 


3



SEQ ID NO: 


1884 


ttctaaaaatccaatctcc 


8392 


8411


1 


3


SEQ ID NO: 


1885 


aaqatcacao acttta aa t 


11646


11665



1


3 



SEQ ID NO: 


1886 


taaactcaaaaaaattaaa 


1912 


1931 





3



SEQ ID NO: 


1887 


caatcatataaaaaaactt 


4421 




1 


3


SEQ ID NO: 


1888 


cto a a actctctccata a c 


10875 


1 WUw*T 


1


3


SEQ ID NO: 


1889 


tctaaactcaa aaa aata a 


13991 



14010 


1 


3


SEQ ID NO: 


1890 


Qoatttcctaaaactaaat 



11165



11184



1 


3


SEQ ID NO: 


1891 


cctgaaatacaatgctctg


5510


5529



1


3


SEQ ID NO: 


1892 


aa aaacaacatttatttat 




45'=;'^ 


1 


3


SEQ ID NO: 


1893 


aaaaactaaanaafntrrf 




79'^'^ 


1 


3


SEQ ID NO: 


1894 


taaaaactaactctataot 





tou 1 


1


3


SEQ ID NO: 


1895 


atcaoaaaaaactcaar^a^ 


2559 




1 


3


SEQ ID NO: 


1896 


tcattactcctaaa eta aa 


11299



11318



1


3


SEQ ID NO: 


1897 


tQaatctoQctccctcaac 


9038 


9057


1 


3


SEQ ID NO: 


1898 


ttaatcoaaaaatataaaa 


7140 


7159



1 


3


SEQ ID NO: 


1899 


attaatcaaaaaatataaa 


7139 


7158 



1 


3


SEQ ID NO: 


1900 


aattq cattaa ata ata cc 


6581 


6600 



1 


3



SEQ ID NO: 


1901 


tggagtttgtgacaaatat


2752 


2771 


1 


3



SEQ ID NO: 


1902 


agaaacagcatttgtttgt


4534 



4553 



1 


3


SEQ ID NO: 


1903 


caaataacataataaactc 


5326 



5345 


1 


3


SEQ ID NO:




aagcatctgattgactcac




•1 0£?0 O 

1^000 


1 


3 


SEQ ID NO: 


1905 


tgggcctgccccagattct 


8901 


8920 


1 


3 


SEQ ID NO: 


1906 


caataagatcaatagcaaa


8990 


9009 


1 


3 


SEQ ID NO: 


1907 


ttggctcacatgaaggcca 


7623 


7642 


1 


3 


SEQ ID NO: 


1908 


gatatacactagggaggaa 


12737 


12756 


1 


3 


SEQ ID NO: 


1909 


agatcaaagttaattggga 


12268 


12287 


1 


3 


SEQ ID NO: 


1910 


gagtcccagtgcccagcaa


9344 


9363 


1 


3 
SEQ ID NO:


908 


ttggatatccaagatctga 


1926 


1945 


SEQ ID NO: 1911 


tcagtataagtacaaccaa 


9392 


9411 


1 


3 


SEQ ID NO: 


909 


tccaagatctgaaaaagtt 


1933 


1952 


SEQ ID NO: 1912 


aacttccaactgtcatgga 


1978 


1997 


1 


3 


SEQ ID NO: 


910 


ctgaaaaagttagtgaaag 


1941 


1960 


SEQ ID NO: 1913 


ctttgaagtcagtcttcag 


7907 


7926 


1 


3 


SEQ ID NO: 


911 


agttagtgaaagaagttct 


1948 


1967 


SEQ ID NO: 1914 


agaatctcaacttccaact 


1970 


1989 


1 


3 


SEQ ID NO: 


912 


aatctcaacttccaactgt 


1972 


1991 


SEQ ID NO: 1915


acaggggtcctttatgatt 


12342 


12361 


1 


3 


SEQ ID NO: 


913 


gtcatggacttcagaaaat


1989 


2008 


SEQ ID NO: 1916 


atttgaaagaataaatgac


7028 


7047 


1 


3 


SEQ ID NO: 


914 


tcaactctacaaatctgtt 


2021 


2040 


SEQ ID NO: 1917 


aacacattgaggctattga


6970 


6989 


1 


3 


SEQ ID NO: 


915 


aactctacaaatctgtttc 


2023 


2042 


SEQ ID NO: 1918 


gaaaaaggggattgaagtt 


10276 


10295 


1 


3 


SEQ ID NO: 


916 


aaatagaagggaatcttat 


2071 


2090 


SEQ ID NO: 1919 


ataagcaaactgttaattt 


5449 


5468 


1 


3 


SEQ ID NO: 


917 


agaagggaatcttatattt 


2075 


2094 


SEQ ID NO: 1920 


aaatgcactgctgcgttct 


4892 


4911 


1 


3 


SEQ ID NO: 


918 


gaagggaatcttatatttg


2076 


2095 


SEQ ID NO: 1921 


caaaaacattttcaacttc 


5279 


5298 


1 


3 


SEQ ID NO: 


919 


tgatccaaataactacctt 


2093 


2112 


SEQ ID NO: 1922 


aaggaagaaagaaaaatca 


3453 


3472 


1 


3 


SEQ ID NO: 


920 


tggatttgcttcagctgac


2150 


2169 


SEQ ID NO: 1923 


gtcagcccagttccttcca 


10924 


10943


1 


3 


SEQ ID NO: 


921 


tttgcttcagctgacctca 


2154 


2173 


SEQ ID NO: 1924 


tgaggaaactcagatcaaa 


12257 


12276 1 


3 


SEQ ID NO: 


922 


cttggaaggaaaaggcttt 


2183 


2202 


SEQ ID NO: 1925 


aaagcattggtagagcaag 


7842 


7861 


1 


3 


SEQ ID NO: 


923 


tggaaggaaaaggctttga 


2185 


2204 


SEQ ID NO: 1926 


tcaagtctgtgggattcca 


4078 


4097 


1 


3 


SEQ ID NO: 


924 


ggctttgagccaacattgg 


2196 


2215 


SEQ ID NO: 1927 


ccaagaggtatttaaagcc 


12950 


12969 1 


3 


SEQ ID NO: 


925 


tgagccaacattggaagct 


2201 


2220 


SEQ ID NO: 1928 


agctttctgccactgctca 


13513 


13532 


1 


3 


SEQ ID NO: 


926 


gagccaacattggaagctc 


2202 


2221 


SEQ ID NO: 1929 


gagctttctgccactgctc 


13512 


13531 


1 


3 


SEQ ID NO: 


927 


aacattggaagctcttttt 


2207 


2226 


SEQ ID NO: 1930 


aaaagaaacagcatttgtt 


4531 


4550 


1 


3 


SEQ ID NO: 


928 


tggaagctctttttgggaa


2212 


2231 


SEQ ID NO: 1931 


ttccggcacgtgggttcca 


3777 


3796 


1 


3 


SEQ ID NO: 


929 


ctctttttgggaagcaagg


2218 


2237 


SEQ ID NO: 1932 


ccttactgactttgcagag 


7790 


7809 


1 


3 


SEQ ID NO: 


930 


tttttgggaagcaaggatt 


2221 


2240 


SEQ ID NO: 1933 


aatcattgaaaaattaaaa 


6722 


6741 


1 


3 


SEQ ID NO: 


931 


ttttcccagacagtgtcaa 


2239 


2258 


SEQ ID NO: 1934 


ttgatgaaatcattgaaaa 


6715 


6734 


1 


3 


SEQ ID NO: 932 


ttggctataccaaagatga


2323 


2342 


SEQ ID NO: 1935 


tcattgctcccggagccaa 


2668 


2687 


1 


3 


SEQ ID NO: 


933 


ataccaaagatgataaaca 


2329 


2348 


SEQ ID NO: 1936 


tgttgcttttgtaaagtat 


6272 


6291 


1 


3 


SEQ ID NO: 934 


gagcaggatatggtaaatg 


2349 


2368 


SEQ ID NO: 1937 


catttcagccttcgggctc 


4254 


4273 


1 


3 


SEQ ID NO: 


935 


atggtaaatggaataatgc 


2358 


2377 


SEQ ID NO: 1938 


gcatgcctagtttctccat 


9946 


9965 


1 


3 


SEQ ID NO: 


936 


tggtaaatggaataatgct 


2359 


2378 


SEQ ID NO: 1939 


agcacagtacgaaaaacca 


10801 


10820 


1 


3 


SEQ ID NO: 937 


taaatggaataatgctcag 


2362 


2381 


SEQ ID NO: 1940 


ctgaaagagatgaaattta 


13059 


13078 


1 


3 


SEQ ID NO: 


938 


tggaataatgctcagtgtt 


2366 


2385 


SEQ ID NO: 1941 


aacagatttgaggattcca


7973 


7992 


1 


3 


SEQ ID NO: 


939 


tcagtgttgagaagctgat 


2377 


2396 


SEQ ID NO: 1942 


atcacaactcctccactga 


9534 


9553 


1 


3 


SEQ ID NO: 


940 


cagtgttgagaagctgatt 


2378 


2397 


SEQ ID NO: 1943 


aatcacaactcctccactg 


9533 


9552 


1 


3 


SEQ ID NO: 


941 


agtgttgagaagctgatta 


2379 


2398 


SEQ ID NO: 1944 


taatcacaactcctccact 


9532 


9551 


1 


3 


SEQ ID NO: 


942 


gattaaagatttgaaatcc


2393 


2412 


SEQ ID NO: 1945 


ggatactaagtaccaaatc 


6866 


6885 


1 


3 


SEQ ID NO: 


943 


gatttgaaatccaaagaag 


2400 


2419 


SEQ ID NO: 1946 


cttccgtttaccagaaatc 


8240 


8259 


1 


3 


SEQ ID NO: 


944 


atttgaaatccaaagaagt 


2401 


2420 


SEQ ID NO: 1947 


acttccgtttaccagaaat 


8239 


8258 


1 


3 


SEQ ID NO: 


945


atccaaagaagtcccggaa 


2408 


2427 


SEQ ID NO: 1948 


ttccaatttccctgtggat 


3680 


3699 


1 


3 
SEQ ID NO: 946


tccaaagaagtcccggaag 


2409 


2428 


SEQ ID NO: 


1949 


cttccaatttccctgtgga 


3679 


3698 1 


3 


SEQ ID NO: 947 


agagcctacctccgcatct 


2430 


2449 


SEQ ID NO: 


1950 


agattaatccgctggctct 


8563 


8582 1 


3 


SEQ ID NO: 948 


gagcctacctccgcatctt 


2431 


2450 


SEQ ID NO: 


1951 


aagattaatccgctggctc


8562 


8581 1 


3 


SEQ ID NO: 949 


cttgggagaggagcttggt 


2447 


2466 


SEQ ID NO: 


1952 


accactgggacctaccaag 


12519 


12538 1 


3 


SEQ ID NO: 950 


ggagcttggttttgccagt 


2456 


2475 


SEQ ID NO: 


1953 


actggtggcaaaaccctcc 


2726 


2745 1 


3 


SEQ ID NO: 951


ttggttttgccagtctcca 


2461 


2480 


SEQ ID NO: 


1954 


tggagaagccacactccaa 


10763 


10782 1 


3 


SEQ ID NO: 952


cagtctccatgacctccag 


2471 


2490 


SEQ ID NO: 


1955 


ctggtcgcctgccaaactg 


3530 


3549 1 


3 


SEQ ID NO: 953 


ctccatgacctccagctcc


2475 


2494 


SEQ ID NO: 


1956 


ggagtcattgctcccggag 


2664 


2683 1 


3 


SEQ ID NO: 954 


ctgggaaagctgcttctga 


2493 


2512 


SEQ ID NO: 


1957 


tcagaaagctaccttccag 


7931 


7950 1 


3 


SEQ ID NO: 955


gaggtcatcaggaagggct 


2553 


2572 


SEQ ID NO: 


1958 


agccagaagtgagatcctc 


3506 


3525 1 


3 


SEQ ID NO: 956 


aagaatgacttttttcttc 


2574 


2593 


SEQ ID NO: 


1959 


gaaggcatctgggagtctt 


3827 


3846 1 


3 


SEQ ID NO: 957 


cttttttcttcactacatc 


2582 


2601 


SEQ ID NO: 


1960 


gatgcttacaacactaaag 


6099 


6118 1 


3 


SEQ ID NO: 958 


catcttcatggagaatgcc


2597 


2616 


SEQ ID NO: 


1961 


ggcacttccaaaattgatg 


10710 


10729 1 


3 


SEQ ID NO: 959 


cttcatggagaatgccttt 


2600 


2619 


SEQ ID NO: 


1962 


aaagttaattgggaagaag


12273 


12292 1 


3 


SEQ ID NO: 960 


aatgcctttgaactcccca


2610 


2629 


SEQ ID NO: 


1963 


tgggctggcttcagccatt 


5729 


5748 1 


3 


SEQ ID NO: 961 


gcctttgaactccccactg 


2613 


2632 


SEQ ID NO: 


1964 


cagtctgaacattgcaggc 


5375 


5394 1 


3 


SEQ ID NO: 962 


caaggctggagtaaaactg 


2684 


2703 


SEQ ID NO: 


1965 


cagtgcaacgaccaacttg 


5072 


5091 1 


3 


SEQ ID NO: 963 


tggagtaaaactggaagta


2690 


2709 


SEQ ID NO: 


1966 


tactccaacgccagctcca 


3051 


3070 1 


3 


SEQ ID NO: 964 


ggaagtagccaacatgcag 


2702 


2721 


SEQ ID NO: 


1967 


ctgccatctcgagagttcc


4098 


4117 1 


3 


SEQ ID NO: 965 


tttgtgacaaatatgggca 


2757 


2776 


SEQ ID NO: 


1968 


tgcctttgtgtacaccaaa 


11228 


11247 1 


3 


SEQ ID NO: 966 


tgtgacaaatatgggcatc 


2759 


2778 


SEQ ID NO: 


1969 


gatgggtctctacgccaca 


4377 


4396 1 


3 


SEQ ID NO: 967 


ggacttcgctaggagtggg 


2786 


2805 


SEQ ID NO: 


1970 


cccaaggccacaggggtcc 


12333 


12352 1 


3 


SEQ ID NO: 968 


gtggggtccagatgaacac 


2800 


2819 


SEQ ID NO: 


1971


gtgttctagacctctccac 


4171 


4190 1 


3 


SEQ ID NO: 969 


ttccacgagtcgggtctgg 


2826 


2845 


SEQ ID NO: 


1972 


ccagaatctgtaccaggaa 


12554 


12573 1 


3 


SEQ ID NO: 970 


agtcgggtctggaggctca 


2833 


2852 


SEQ ID NO: 


1973 


tgagaactacgagctgact 


4799 


4818 1 


3 


SEQ ID NO: 971 


tcgggtctggaggctcatg 


2835 


2854 


SEQ ID NO: 


1974 


catgaaggccaaattccga 


7631 


7650 1 


3 


SEQ ID NO: 972 


aaaagctgggaagctgaag


2861 


2880 


SEQ ID NO: 


1975 


cttccagacacctgatttt 


7943 


7962 1 


3 


SEQ ID NO: 973 


aagctgaagtttatcattc 


2871 


2890 


SEQ ID NO: 


1976 


gaatttacaattgttgctt 


6261 


6280 1 


3 


SEQ ID NO: 974 


gagaccagtcaagctgctc 


2900 


2919 


SEQ ID NO: 


1977 


gagcttcaggaagcttctc 


13206 


13225 1 


3 


SEQ ID NO: 975 


gcaacacattacatttggt 


2926 


2945 


SEQ ID NO: 


1978 


accagtcagatattgttgc 


10183 


10202 1 


3 


SEQ ID NO: 976 


acattacatttggtctcta 


2931 


2950 


SEQ ID NO: 


1979 


tagaatatgaactaaatgt 


11881 


11900 1 


3 


SEQ ID NO: 977 


cattacatttggtctctac 


2932 


2951 


SEQ ID NO: 


1980 


gtagctgagaaaatcaatg 


7098 


7117 1 


3 


SEQ ID NO: 978 


aaacggaggtgatcccacc 


2956 


2975 


SEQ ID NO: 


1981 


ggtggataccctgaagttt 


3197 


3216 1 


3 


SEQ ID NO: 979 


attgagaacaggcagtcct 


2979 


2998 


SEQ ID NO: 


1982 


aggaaaagcgcacctcaat 


12023 


12042 1 


3 


SEQ ID NO: 980 


tgagaacaggcagtcctgg 


2981 


3000 


SEQ ID NO: 


1983 


ccagcttcccx^acatctca 


8333 


8352 1 


3 


SEQ ID NO: 981 


ctgcacctcaggcgcttac 


3035 


3054 


SEQ ID NO: 


1984 


gtaagaaaatacagagcag 


6432 


6451 1 


3 


SEQ ID NO: 982 


tccacagactccgcctcct


3066 


3085 


SEQ ID NO: 


1985 


aggacagagccttggtgga 


3184 


3203 1 


3 


SEQ ID NO: 983 


ctgaccggggacaccagat 


3093 


3112 


SEQ ID NO: 


1986 


atctgatgaggaaactcag 


12251 


12270 1 


3 
SEQ ID NO:


984 


tagagctggaactgaggcc 


3112 


3131 


SEQ ID NO: 1987 


ggcctctctggggcatcta


5136 


5155 


1 




SEQ ID NO: 


985 


ctatgagctccagagagag 


3167 


3186 


SEQ ID NO: 1988 


ctctcacaaaaaagtatag 


6541 


6560 






SEQ ID NO:


986 


cttggtggataccctgaag 


3194 


3213 


SEQ ID NO: 1989 


cttcaggaagcttctcaag 


13209 


13228 


1


3 


SEQ ID NO: 


987 


ttgtaactcaagcagaagg 


3214 


3233 


SEQ ID NO: 1990 


ccttacacaataatcacaa 


9522 


9541 


1


3 


SEQ ID NO: 


988 


taactcaagcagaaggtgc 


3217 


3236 


SEQ ID NO: 1991 


gcacctagctggaaagtta 


6947 


6966 


1 


3 


SEQ ID NO: 


989 


gcagaaggtgcgaagcaga 


3225


3244 


SEQ ID NO: 1992 


tctgtgggattccatctgc 


4083 


4102 


1


3 


SEQ ID NO: 


990 


cagaaggtgcgaagcagac


3226


3245 


SEQ ID NO: 1993 


gtctgtgggattccatctg


4082 


4101 




3 


SEQ ID NO: 


991 


gtatgaccttgtccagtga 


3280 


3299 


SEQ ID NO: 1994 


tcaccaacggagaacatac 


10843 


10862 


1 


3 


SEQ ID NO: 


992 


tatgaccttgtccagtgaa 


3281 


3300 


SEQ ID NO: 1995 


ttcaccaacggagaacata 


10842 


10861 


1 


3 


SEQ ID NO: 


993 


gaagtccaaattccggatt 


3297 


3316 


SEQ ID NO: 1996 


aatctcaagctttctcttc 


10044 


10063 


1


3 


SEQ ID NO: 


994 


gagggcaaaacgtcttaca


3363 


3382 


SEQ ID NO: 1997 


tgtacaactggtccgcctc 


4207 


4226 


1


3 


SEQ ID NO: 


995 


agggcaaaacgtcttacag 


3364 


3383 


SEQ ID NO: 1998 


ctgttaggacaccagccct 


4054 


4073 


1 


3 


SEQ ID NO: 


996 


gactcaccctggacattca


3382 


3401 


SEQ ID NO: 1999 


tgaaattcaatcacaagtc 


9068 


9087 


1 


3 


SEQ ID NO: 


997 


ctggacattcagaacaaga 


3390 


3409 


SEQ ID NO: 2000


tcttttcttttcaacccaa 


9218 


9237 




3 


SEQ ID NO: 


998 


tcatgggcgacctaagttg 


3427 


3446 


SEQ ID NO: 2001 


caactgcagacatatatga 


6627 


6646 


1 


3 


SEQ ID NO: 


999 


tgggcgacctaagttgtga 


3430 


3449 


SEQ ID NO: 2002 


tcactccattaacctccca 


6308 


6327 


1 


3 


SEQ ID NO: 


1000 


agttgtgacacaaaggaag 


3441 


3460 


SEQ ID NO: 2003 


cttcttttccaattgaact


13830 


13849 


1 


3 


SEQ ID NO: 


1001 


tgacacaaaggaagaaaga


3446


3465 


SEQ ID NO: 2004 


tcttcatcttcatctgtca 


10212 


10231 


1 


3 


SEQ ID NO: 


1002 


gacacaaaggaagaaagaa


3447


3466 


SEQ ID NO: 2005 


ttcttcatcttcatctgtc 


10211 


10230 


1 


3 


SEQ ID NO: 


1003 


ggaagaaagaaaaatcaag


3455


3474 


SEQ ID NO: 2006 


cttgtcatgcctacgttcc 


11340 


11359 


1 


3 



SEQ ID NO: 2007 aaaaagcgatggccgggtc 3947 3966 SEQ ID NO: 2313 gaccttgcaagaatatttt 6335 6354 1 3
SEQ ID NO: 2008 gtcaaatataccttgaaca 3963 3982 SEQ ID NO: 2314 tgttaacaaattccttgac 7355 7374 1 3
SEQ ID NO: 2009 tgaacaagaacagtttgaa 3976 3995 SEQ ID NO: 2315 ttcaagttcctgaccttca 8302 8321 1 3
SEQ ID NO: 2010 agtttgaaaattgagattc 3987 4006 SEQ ID NO: 2316 gaatctggctccctcaact 9039 9058 1 3
SEQ ID NO: 2011 gtttgaaaattgagattcc 3988 4007 SEQ ID NO: 2317 ggaaataccaagtcaaaac 10446 10465 1 3
SEQ ID NO: 2012 ttgaaaattgagattcctt 3990 4009 SEQ ID NO: 2318 aaggaaaagcgcacctcaa 12022 12041 1 3
SEQ ID NO: 2013 ctaaagatgttagagactg 4038 4057 SEQ ID NO: 2319 cagttgaccacaagcttag 10537 10556 1 3
SEQ ID NO: 2014 atgttagagactgttagga 4044 4063 SEQ ID NO: 2320 tccttaacaccttccacat 8065 8084 1 3
SEQ ID NO: 2015 cagccctccacttcaagtc 4066 4085 SEQ ID NO: 2321 gacttctctagtcaggctg 8805 8824 1 3
SEQ ID NO: 2016 agccctccacttcaagtct 4067 4086 SEQ ID NO: 2322 agacatcgctgggctggct 5720 5739 1 3
SEQ ID NO: 2017 ccatctgccatctcgagag 4094 4113 SEQ ID NO: 2323 ctctcaaatgacatgatgg 5322 5341 1 3
SEQ ID NO: 2018 attcccaagttgtatcaac 4134 4153 SEQ ID NO: 2324 gttgagaagccccaagaat 6246 6265 1 3
SEQ ID NO: 2019 tcaactgcaagtgcctctc 4148 4167 SEQ ID NO: 2325 gagatcaagacactgttga 8835 8854 1 3
SEQ ID NO: 2020 ggtgttctagacctctcca 4170 4189 SEQ ID NO: 2326 tggaaccctctccctcacc 4727 4746 1 3
SEQ ID NO: 2021 ctccacgaatgtctacagc 4184 4203 SEQ ID NO: 2327 gctggtaacctaaaaggag 5580 5599 1 3
SEQ ID NO: 2022 cacgaatgtctacagcaac 4187 4206 SEQ ID NO: 2328 gttgcccaccatcatcgtg 11663 11682 1 3
SEQ ID NO: 2023 acgaatgtctacagcaact 4188 4207 SEQ ID NO: 2329 agttgcccaccatcatcgt 11662 11681 1 3
SEQ ID NO: 2024 tcctacagtggtggcaaca 4224 4243 SEQ ID NO: 2330 tgttagttgctcttaagga 13351 13370 1 3
SEQ ID NO: 2025 cgttaccacatgaaggctg 4272 4291 SEQ ID NO: 2331 cagcaagtacctgagaacg 8603 8622 1 3
SEQ ID NO: 2026 gaaggctgactctgtggtt 4283 4302 SEQ ID NO: 2332 aacctatgccttaatcttc 13161 13180 1 3
SEQ ID NO: 2027 tgtggttgacctgctttcc 4295 4314 SEQ ID NO: 2333 ggaaagttaaaacaacaca 6957 6976 1 3
SEQ ID NO: 2028 cctgctttcctacaatgtg 4304 4323 SEQ ID NO: 2334 cacaccttgacattgcagg 11080 11099 1 3
SEQ ID NO: 2029 ctgctttcctacaatgtgc 4305 4324 SEQ ID NO: 2335 gcacaccttgacattgcag 11079 11098
SEQ ID NO: 2030 tcctacaatgtgcaaggat 4311 4330 SEQ ID NO: 2336 atccgctggctctgaagga 8569 8588
SEQ ID NO: 2031 tatgaccacaagaatacgt 4344 4363 SEQ ID NO: 2337 acgtccgtgtgccttcata 9976 9995
SEQ ID NO: 2032 atgaccacaagaatacgtc 4345 4364 SEQ ID NO: 2338 gacgtccgtgtgccttcat 9975 9994
SEQ ID NO: 2033 gaatacgtctacactatca 4355 4374 SEQ ID NO: 2339 tgattatctgaattcattc 6479 6498
SEQ ID NO: 2034 tttctagattcgaatatca 4398 4417 SEQ ID NO: 2340 tgatttacatgatttgaaa 6677 6696
SEQ ID NO: 2035 gattcgaatatcaaattca 4404 4423 SEQ ID NO: 2341 tgaagtagctgagaaaatc 7094 7113
SEQ ID NO: 2036 gaaacaacccagtctcaaa 4441 4460 SEQ ID NO: 2342 tttgaaaaattctcttttc 9206 9225
SEQ ID NO: 2037 cccagtctcaaaaggttta 4448 4467 SEQ ID NO: 2343 taaattcattactcctggg 11294 11313
SEQ ID NO: 2038 ctcaaaaggtttactaata 4454 4473 SEQ ID NO: 2344 tattcaaaactgagttgag 12223 12242
SEQ ID NO: 2039 tcaaaaggtttactaatat 4455 4474 SEQ ID NO: 2345 atattcaaaactgagttga 12222 12241
SEQ ID NO: 2040 aaaaggtttactaatattc 4457 4476 SEQ ID NO: 2346 gaatttgaaagttcgtttt 9272 9291
SEQ ID NO: 2041 gaaacagcatttgtttgtc 4535 4554 SEQ ID NO: 2347 gacagcatcttcgtgtttc 11206 11225
SEQ ID NO: 2042 atttgtttgtcaaagaagt 4543 4562 SEQ ID NO: 2348 acttaaaaaatataaaaat 8014 8033
SEQ ID NO: 2043 tcaagattgatgggcagtt 4561 4580 SEQ ID NO: 2349 aactctcaagtcaagttga 13414 13433
SEQ ID NO: 2044 ttcagagtctcttcgttct 4578 4597 SEQ ID NO: 2350 agaagatggcaaatttgaa 11987 12006
SEQ ID NO: 2045 cagagtctcttcgttctat 4580 4599 SEQ ID NO: 2351 atagcatggacttcttctg 8865 8884
SEQ ID NO: 2046 atgctaaaggcacatatgg 4597 4616 SEQ ID NO: 2352 ccatttgagatcacggcat 9237 9256
SEQ ID NO: 2047 gcacatatggcctgtcttg 4606 4625 SEQ ID NO: 2353 caagttggcaagtaagtgc 9364 9383
SEQ ID NO: 2048 gagtccaacctgaggttta 4659 4678 SEQ ID NO: 2354 taaagtgccacttttactc 6182 6201
SEQ ID NO: 2049 agtccaacctgaggtttaa 4660 4679 SEQ ID NO: 2355 ttaacagggaagatagact 9300 9319
SEQ ID NO: 2050 cctacctccaaggcaccaa 4684 4703 SEQ ID NO: 2356 ttggcaagtaagtgctagg 9368 9387
SEQ ID NO: 2051 gaagatggaaccctctccc 4722 4741 SEQ ID NO: 2357 gggaagaagaggcagcttc 12283 12302
SEQ ID NO: 2052 tgatctgcaaagtggcatc 4754 4773 SEQ ID NO: 2358 gatgaggaaactcagatca 12255 12274
SEQ ID NO: 2053 gatctgcaaagtggcatca 4755 4774 SEQ ID NO: 2359 tgatgaggaaactcagatc 12254 12273
SEQ ID NO: 2054 gcttccctaaagtatgaga 4785 4804 SEQ ID NO: 2360 tctcgtgtctaggaaaagc 5969 5988
SEQ ID NO: 2055 gtatgagaactacgagctg 4796 4815 SEQ ID NO: 2361 cagcttaagagacacatac 6912 6931
SEQ ID NO: 2056 tctaacaagatggatatga 4860 4879 SEQ ID NO: 2362 tcattttccaactaataga 13024 13043
SEQ ID NO: 2057 ctgctgcgttctgaatatc 4899 4918 SEQ ID NO: 2363 gatacaagaaaaactgcag 6893 6912
SEQ ID NO: 2058 tcattgaggttcttcagcc 4932 4951 SEQ ID NO: 2364 ggctcatatgctgaaatga 5340 5359
SEQ ID NO: 2059 ttctggatcactaaattcc 4955 4974 SEQ ID NO: 2365 ggaaggacaaggcccagaa 12541 12560
SEQ ID NO: 2060 ccatggtcttgagttaaat 4973 4992 SEQ ID NO: 2366 atttttattcctgccatgg 10095 10114
SEQ ID NO: 2061 tcttaggcactgacaaaat 4999 5018 SEQ ID NO: 2367 attttttgcaagttaaaga 14011 14030
SEQ ID NO: 2062 acaaggcgacactaaggat 5032 5051 SEQ ID NO: 2368 atccatgatctacatttgt 6786 6805
SEQ ID NO: 2063 tgcaacgaccaacttgaag 5075 5094 SEQ ID NO: 2369 cttcagggaacacaatgca 5177 5196
SEQ ID NO: 2064 caacttgaagtgtagtctc 5084 5103 SEQ ID NO: 2370 gagatgagagatgccgttg 6231 6250
SEQ ID NO: 2065 gctggagaatgagctgaat 5108 5127 SEQ ID NO: 2371 attctcttttcttttcagc 9214 9233
SEQ ID NO: 2066 gcagagcttggcctctctg 5127 5146 SEQ ID NO: 2372 cagatacaagaaaaactgc 6891 6910
SEQ ID NO: 2067 tctctggggcatctatgaa 5140 5159 SEQ ID NO: 2373 ttcattcaattgggagaga 6491 6510
SEQ ID NO: 2068 tctggggcatctatgaaat 5142 5161 SEQ ID NO: 2374 atttgtaagaaaatacaga 6428 6447
SEQ ID NO: 2069 aacacaatgcaaaattcag 5185 5204 SEQ ID NO: 2375 ctgaagcattaaaactgtt 7498 7517
SEQ ID NO: 2070 ctcacagagctatcactgg 5223 5242 SEQ ID NO: 2376 ccagatgctgaacagtgag 8141 8160
SEQ ID NO: 2071 tgggaagtgcttatcaggc 5239 5258 SEQ ID NO: 2377 gcctacgttccatgtccca 11348 11367
SEQ ID NO: 2072 ttcaaggtcagtcaagaag 5295 5314 SEQ ID NO: 2378 cttcagtgcagaatatgaa 11969 11988
SEQ ID NO: 2073 aatgacatgatgggctcat 5328 5347 SEQ ID NO: 2379 atgattatctgaattcatt 6478 6497
SEQ ID NO: 2074 gctcatatgctgaaatgaa 5341 5360 SEQ ID NO: 2380 ttcagccattgacatgagc 5738 5757
SEQ ID NO: 2075 atatgctgaaatgaaattt 5345 5364 SEQ ID NO: 2381 aaatagctattgctaatat 6694 6713
SEQ ID NO: 2076 tctgaacattgcaggctta 5378 5397 SEQ ID NO: 2382 taagaaccagaagatcaga 10988 11007
SEQ ID NO: 2077 gaacattgcaggcttatca 5381 5400 SEQ ID NO: 2383 tgatatcgacgtgaggttc 12482 12501 1
SEQ ID NO: 2078 tgcaggcttatcactggac 5387 5406 SEQ ID NO: 2384 gtcctggattccacatgca 11844 11863 1
SEQ ID NO: 2079 tcaaaacttgacaacattt 5412 5431 SEQ ID NO: 2385 aaattccttgacatgttga 7362 7381 1
SEQ ID NO: 2080 atttacagctctgacaagt 5427 5446 SEQ ID NO: 2386 acttaaaaaatataaaaat 8014 8033 1
SEQ ID NO: 2081 ctctgacaagttttataag 5435 5454 SEQ ID NO: 2387 cttacttgaattccaagag 10666 10685 1
SEQ ID NO: 2082 gttaatttacagctacagc 5460 5479 SEQ ID NO: 2388 gctgcatgtggctggtaac 5570 5589 1
SEQ ID NO: 2083 ttctctggtaactacttta 5483 5502 SEQ ID NO: 2389 taaaagattactttgagaa 7267 7286 1
SEQ ID NO: 2084 cctaaaaggagcctaccaa 5588 5607 SEQ ID NO: 2390 ttggcaagtaagtgctagg 9368 9387 1
SEQ ID NO: 2085 aaaaggagcctaccaaaat 5591 5610 SEQ ID NO: 2391 atttacaattgttgctttt 6263 6282 1
SEQ ID NO: 2086 aggagcctaccaaaataat 5594 5613 SEQ ID NO: 2392 attacctatgatttctcct 10119 10138 1
SEQ ID NO: 2087 ataatgaaataaaacacat 5608 5627 SEQ ID NO: 2393 atgtcaaacactttgttat 7057 7076 1
SEQ ID NO: 2088 aaaacacatctatgccatc 5618 5637 SEQ ID NO: 2394 gatgaagatgacgactttt 12150 12169 1
SEQ ID NO: 2089 tgctaaggttcagggtgtg 5678 5697 SEQ ID NO: 2395 cacaagtcgattcccagca 9079 9098 1
SEQ ID NO: 2090 gagtttagccatcggctca 5697 5716 SEQ ID NO: 2396 tgaggtgactcagagactc 7442 7461 1
SEQ ID NO: 2091 gctggcttcagccattgac 5732 5751 SEQ ID NO: 2397 gtcagtgaagttctccagc 8588 8607 1
SEQ ID NO: 2092 atttcagcaatgtcttccg 5782 5801 SEQ ID NO: 2398 cggagcatgggagtgaaat 8620 8639 1
SEQ ID NO: 2093 tttcagcaatgtcttccgt 5783 5802 SEQ ID NO: 2399 acggagcatgggagtgaaa 8619 8638 1
SEQ ID NO: 2094 ttcagcaatgtcttccgtt 5784 5803 SEQ ID NO: 2400 aacggagcatgggagtgaa 8618 8637
SEQ ID NO: 2095 cagcaatgtcttccgttct 5786 5805 SEQ ID NO: 2401 agaagtgtcttcaaagctg 12404 12423
SEQ ID NO: 2096 tgtcttccgttctgtaatg 5792 5811 SEQ ID NO: 2402 cattcaattgggagagaca 6493 6512
SEQ ID NO: 2097 gtcttccgttctgtaatgg 5793 5812 SEQ ID NO: 2403 ccattcagtctctcaagac 12967 12986
SEQ ID NO: 2098 atgggaaactcgctctctg 5851 5870 SEQ ID NO: 2404 cagataaaaaactcaccat 12205 12224
SEQ ID NO: 2099 ggagaacatactgggcagc 5871 5890 SEQ ID NO: 2405 gctgttttgaagactctcc 1080 1099
SEQ ID NO: 2100 gttgaaagcagaacctctg 5906 5925 SEQ ID NO: 2406 cagaattcataatcccaac 8266 8285
SEQ ID NO: 2101 gtctaggaaaagcatcagt 5975 5994 SEQ ID NO: 2407 actgcaagatttttcagac 13604 13623
SEQ ID NO: 2102 agcatcagtgcagctcttg 5985 6004 SEQ ID NO: 2408 caagaacctgttagttgct 13343 13362
SEQ ID NO: 2103 ttgaacacaaagtcagtgc 6001 6020 SEQ ID NO: 2409 gcacatcaatattgatcaa 6410 6429
SEQ ID NO: 2104 gcagacaggcacctggaaa 6038 6057 SEQ ID NO: 2410 tttcagatggcattgctgc 11602 11621
SEQ ID NO: 2105 gaaactcaagacccaattt 6053 6072 SEQ ID NO: 2411 aaatcccatccaggttttc 8029 8048
SEQ ID NO: 2106 acaatgaatacagccagga 6076 6095 SEQ ID NO: 2412 tcctttggctgtgctttgt 9674 9693
SEQ ID NO: 2107 cttggatgcttacaacact 6095 6114 SEQ ID NO: 2413 agtgaagttctccagcaag 8591 8610
SEQ ID NO: 2108 ttggcgtggagcttactgg 6124 6143 SEQ ID NO: 2414 ccagaattcataatcccaa 8265 8284
SEQ ID NO: 2109 cacttttactcagtgagcc 6190 6209 SEQ ID NO: 2415 ggctattgatgttagagtg 6980 6999
SEQ ID NO: 2110 tttagagatgagagatgcc 6227 6246 SEQ ID NO: 2416 ggcatgatgctcatttaaa 9169 9188
SEQ ID NO: 2111 gagaagccccaagaattta 6249 6268 SEQ ID NO: 2417 taaagccattcagtctctc 12962 12981
SEQ ID NO: 2112 caattgttgcttttgtaaa 6268 6287 SEQ ID NO: 2418 tttaaccagtcagatattg 10179 10198
SEQ ID NO: 2113 ttttgtaaagtatgataaa 6278 6297 SEQ ID NO: 2419 tttattgctgaatccaaaa 13647 13666
SEQ ID NO: 2114 ttgtaaagtatgataaaaa 6280 6299 SEQ ID NO: 2420 ttttgagaggaatcgacaa 6350 6369
SEQ ID NO: 2115 ttcactccattaacctccc 6307 6326 SEQ ID NO: 2421 gggaaaaaacaggcttgaa 9568 9587
SEQ ID NO: 2116 ttttgagaccttgcaagaa 6329 6348 SEQ ID NO: 2422 ttctctctatgggaaaaaa 9558 9577
SEQ ID NO: 2117 accttgcaagaatattttg 6336 6355 SEQ ID NO: 2423 caaaagaagcccaagaggt 12940 12959
SEQ ID NO: 2118 tcaatattgatcaatttgt 6415 6434 SEQ ID NO: 2424 acaaagcagattatgttga 11821 11840
SEQ ID NO: 2119 cagagcagccctgggaaaa 6443 6462 SEQ ID NO: 2425 ttttcagaccaactctctg 13614 13633
SEQ ID NO: 2120 cctgggaaaactcccacag 6452 6471 SEQ ID NO: 2426 ctgtctctggtcagccagg 7716 7735
SEQ ID NO: 2121 actcccacagcaagctaat 6461 6480 SEQ ID NO: 2427 attacacttcctttcgagt 12861 12880
SEQ ID NO: 2122 aattcattcaattgggaga 6489 6508 SEQ ID NO: 2428 tctcttcctccatggaatt 10471 10490
SEQ ID NO: 2123 ttcaattgggagagacaag 6495 6514 SEQ ID NO: 2429 cttggagtgccagtttgaa 11800 11819
SEQ ID NO: 2124 aggagaaactgactgctct 6526 6545 SEQ ID NO: 2430 agagcttatgggatttcct 11155 11174
SEQ ID NO: 2125 actgactgctctcacaaaa 6533 6552 SEQ ID NO: 2431 ttttggcaagctatacagt 8372 8391
SEQ ID NO: 2126 gactgctctcacaaaaaag 6536 6555 SEQ ID NO: 2432 ctttgtgagtttatcagtc 9687 9706
SEQ ID NO: 2127 cagacatatatgatacaat 6633 6652 SEQ ID NO: 2433 attggatatccaagatctg 1925 1944
SEQ ID NO: 2128 aatttgatcagtatattaa 6649 6668 SEQ ID NO: 2434 ttaaaagaaatcttcaatt 13807 13826
SEQ ID NO: 2129 tatgatttacatgatttga 6675 6694 SEQ ID NO: 2435 tcaatgattatatcccata 13120 13139
SEQ ID NO: 2130 tttgaaaatagctattgct 6689 6708 SEQ ID NO: 2436 agcacagaaaaaattcaaa 13856 13875
SEQ ID NO: 2131 ttgaaaatagctattgcta 6690 6709 SEQ ID NO: 2437 tagcacagaaaaaattcaa 13855 13874
SEQ ID NO: 2132 aatagctattgctaatatt 6695 6714 SEQ ID NO: 2438 aataaatggagtctttatt 14076 14095
SEQ ID NO: 2133 attattgatgaaatcattg 6711 6730 SEQ ID NO: 2439 caataccagaattcataat 8260 8279
SEQ ID NO: 2134 aaagtcttgatgagcacta 6739 6758 SEQ ID NO: 2440 tagtgattacacttccttt 12856 12875
SEQ ID NO: 2135 aagtcttgatgagcactat 6740 6759 SEQ ID NO: 2441 atagcaacactaaatactt 8761 8780
SEQ ID NO: 2136 ttgatgagcactatcatat 6745 6764 SEQ ID NO: 2442 atatccaagatgagatcaa 13093 13112
SEQ ID NO: 2137 taattttagtaaaaacaat 6769 6788 SEQ ID NO: 2443 attgagattccctccatta 11694 11713
SEQ ID NO: 2138 ttttagtaaaaacaatcca 6772 6791 SEQ ID NO: 2444 tggagtgccagtttgaaaa 11802 11821
SEQ ID NO: 2139 acatttgtttattgaaaat 6797 6816 SEQ ID NO: 2445 atttcctaaagctggatgt 11167 11186
SEQ ID NO: 2140 attgattttaacaaaagtg 6816 6835 SEQ ID NO: 2446 cactgttccagttgtcaat 9863 9882
SEQ ID NO: 2141 attttaacaaaagtggaag 6820 6839 SEQ ID NO: 2447 cttcaaagacttaaaaaat 8006 8025
SEQ ID NO: 2142 aaatcagaatccagataca 6880 6899 SEQ ID NO: 2448 tgtaccataagccatattt 10080 10099
SEQ ID NO: 2143 gaatccagatacaagaaaa 6886 6905 SEQ ID NO: 2449 ttttctaaacttgaaattc 9057 9076
SEQ ID NO: 2144 ttaagagacacatacagaa 6916 6935 SEQ ID NO: 2450 ttcttaaacattcctttaa 9483 9502
SEQ ID NO: 2145 atccagcacctagctggaa 6942 6961 SEQ ID NO: 2451 ttccaatttccctgtggat 3680 3699
SEQ ID NO: 2146 tgagcatgtcaaacacttt 7052 7071 SEQ ID NO: 2452 aaagtgccacttttactca 6183 6202
SEQ ID NO: 2147 gagcatgtcaaacactttg 7053 7072 SEQ ID NO: 2453 caaatgacatgatgggctc 5326 5345
SEQ ID NO: 2148 aaacactttgttataaatc 7062 7081 SEQ ID NO: 2454 gattatatcccatatgttt 13125 13144
SEQ ID NO: 2149 tgagaaaatcaatgccttc 7103 7122 SEQ ID NO: 2455 gaaggaaaagcgcacctca 12021 12040
SEQ ID NO: 2150 tatgaagtagaccaacaaa 7152 7171 SEQ ID NO: 2456 tttgtggagggtagtcata 10323 10342
SEQ ID NO: 2151 aagtagaccaacaaatcca 7156 7175 SEQ ID NO: 2457 tggatgaagatgacgactt 12148 12167
SEQ ID NO: 2152 aagttgaaggagactattc 7215 7234 SEQ ID NO: 2458 gaataccaatgctgaactt 10160 10179
SEQ ID NO: 2153 acaagttaagataaaagat 7256 7275 SEQ ID NO: 2459 atctaaattcagttcttgt 11326 11345
SEQ ID NO: 2154 aagataaaagattactttg 7263 7282 SEQ ID NO: 2460 caaaatagaagggaatctt 2069 2088
SEQ ID NO: 2155 gattactttgagaaattag 7272 7291 SEQ ID NO: 2461 ctaaacttgaaattcaatc 9061 9080
SEQ ID NO: 2156 tgagaaattagttggattt 7280 7299 SEQ ID NO: 2462 aaatccgtgaggtgactca 7435 7454
SEQ ID NO: 2157 aaattagttggatttattg 7284 7303 SEQ ID NO: 2463 caattttgagaatgaattt 10411 10430
SEQ ID NO: 2158 tggatttattgatgatgct 7292 7311 SEQ ID NO: 2464 agcatgcctagtttctcca 9945 9964
SEQ ID NO: 2159 tcattgaagatgttaacaa 7345 7364 SEQ ID NO: 2465 ttgtagatgaaaccaatga 7414 7433
SEQ ID NO: 2160 cattgaagatgttaacaaa 7346 7365 SEQ ID NO: 2466 tttgtagatgaaaccaatg 7413 7432
SEQ ID NO: 2161 attgaagatgttaacaaat 7347 7366 SEQ ID NO: 2467 atttaagtatgatttcaat 10487 10506
SEQ ID NO: 2162 ttgaagatgttaacaaatt 7348 7367 SEQ ID NO: 2468 aatttaagtatgatttcaa 10486 10505
SEQ ID NO: 2163 tgaagatgttaacaaattc 7349 7368 SEQ ID NO: 2469 gaatttaagtatgatttca 10485 10504
SEQ ID NO: 2164 acatgttgataaagaaatt 7372 7391 SEQ ID NO: 2470 aattccctgaagttgatgt 11479 11498
SEQ ID NO: 2165 tttgattaccaccagtttg 7398 7417 SEQ ID NO: 2471 caaattgaacatccccaaa 8783 8802
SEQ ID NO: 2166 caaaatccgtgaggtgact 7433 7452 SEQ ID NO: 2472 agtccccctaacagatttg 7964 7983
SEQ ID NO: 2167 aaaatccgtgaggtgactc 7434 7453 SEQ ID NO: 2473 gagtgaaatgctgtttttt 8630 8649
SEQ ID NO: 2168 aggtgactcagagactcaa 7444 7463 SEQ ID NO: 2474 ttgatgatatctggaacct 10723 10742
SEQ ID NO: 2169 gtgaaattcaggctctgga 7465 7484 SEQ ID NO: 2475 tccaatctcctcttttcac 8401 8420
SEQ ID NO: 2170 gttgcagtgtatctggaaa 7539 7558 SEQ ID NO: 2476 tttcaagcaaatgcacaac 8532 8551
SEQ ID NO: 2171 ttaagttcagcatctttgg 7608 7627 SEQ ID NO: 2477 ccaatgctgaactttttaa 10165 10184
SEQ ID NO: 2172 tgaaggccaaattccgaga 7633 7652 SEQ ID NO: 2478 tctcctttcttcatcttca 10205 10224
SEQ ID NO: 2173 aatgtatcaaatggacatt 7676 7695 SEQ ID NO: 2479 aatgaagtccggattcatt 11013 11032
SEQ ID NO: 2174 attcagcaggaacttcaac 7692 7711 SEQ ID NO: 2480 gttgagaagccccaagaat 6246 6265
SEQ ID NO: 2175 acctgtctctggtcagcca 7714 7733 SEQ ID NO: 2481 tggcaagtaagtgctaggt 9369 9388
SEQ ID NO: 2176 cctgtctctggtcagccag 7715 7734 SEQ ID NO: 2482 ctggacttctctagtcagg 8802 8821
SEQ ID NO: 2177 ggtcagccaggtttatagc 7724 7743 SEQ ID NO: 2483 gctaaaggagcagttgacc 10527 10546
SEQ ID NO: 2178 ccaggtttatagcacactt 7730 7749 SEQ ID NO: 2484 aagtccggattcattctgg 11017 11036
SEQ ID NO: 2179 gtttatagcacacttgtca 7734 7753 SEQ ID NO: 2485 tgacctgtccattcaaaac 13673 13692
SEQ ID NO: 2180 acttgtcacctacatttct 7745 7764 SEQ ID NO: 2486 agaaaaaggggattgaagt 10275 10294
SEQ ID NO: 2181 ctgattggtggactcttgc 7762 7781 SEQ ID NO: 2487 gcaagttaaagaaaatcag 14018 14037
SEQ ID NO: 2182 atgaaagcattggtagagc 7839 7858 SEQ ID NO: 2488 gctcatctcctttcttcat 10200 10219
SEQ ID NO: 2183 tgaaagcattggtagagca 7840 7859 SEQ ID NO: 2489 tgctcatctcctttcttca 10199 10218
SEQ ID NO: 2184 gggttcactgttcctgaaa 7860 7879 SEQ ID NO: 2490 tttcaccatagaaggaccG 8951 8970
SEQ ID NO: 2185 tcaagaccatccttgggac 7879 7898 SEQ ID NO: 2491 gtccccctaacagatttga 7965 7984
SEQ ID NO: 2186 ccttgggaccatgcctgcc 7889 7908 SEQ ID NO: 2492 ggcaccagggctcggaagg 13970 13989
SEQ ID NO: 2187 ttcaggctcttcagaaagc 7921 7940 SEQ ID NO: 2493 gcttgaaggaattcttgaa
SEQ ID NO: 2188 ttcagataaacttcaaaga 7996 8015 SEQ ID NO: 2494 tcttcataagttcaatgaa 13175 13194
SEQ ID NO: 2189 acttcaaagacttaaaaaa 8005 8024 SEQ ID NO: 2495 ttttaacaaaagtggaagt 6821 6840
SEQ ID NO: 2190 atcccatccaggttttcca 8031 8050 SEQ ID NO: 2496 tggagaagcaaatctggat 9464 9483
SEQ ID NO: 2191 gaatttaccatccttaaca 8055 8074 SEQ ID NO: 2497 tgttgaagtgtctccattc 9881 9900
SEQ ID NO: 2192 cattccttcctttacaatt 8081 8100 SEQ ID NO: 2498 aattccaattttgagaatg 10406 10425
SEQ ID NO: 2193 ttgaccagatgctgaacag 8137 8156 SEQ ID NO: 2499 ctgttgaaagatttatcaa 12924 12943
SEQ ID NO: 2194 aatcaccctgccagacttc 8225 8244 SEQ ID NO: 2500 gaagttctcaattttgatt 8514 8533
SEQ ID NO: 2195 tgaccttcacataccagaa 8312 8331 SEQ ID NO: 2501 ttcttctggaaaagggtca 8876 8895
SEQ ID NO: 2196 ttccagcttccccacatct 8331 8350 SEQ ID NO: 2502 agattctcagatgagggaa 8913 8932
SEQ ID NO: 2197 aagctatacagtattctga 8379 8398 SEQ ID NO: 2503 tcagatggcattgctgctt 11604 11623
SEQ ID NO: 2198 attctgaaaatccaatctc 8391 8410 SEQ ID NO: 2504 gagataaccgtgcctgaat 11544 11563
SEQ ID NO: 2199 tttcacattagatgcaaat 8414 8433 SEQ ID NO: 2505 attttgaaaaaaacagaaa 9730 9749
SEQ ID NO: 2200 caaatgctgacatagggaa 8428 8447 SEQ ID NO: 2506 ttccatcacaaatcctttg 9662 9681
SEQ ID NO: 2201 gagagtccaaattagaagt 8500 8519 SEQ ID NO: 2507 actttacttcccaactctc 13402 13421
SEQ ID NO: 2202 agagtccaaattagaagtt 8501 8520 SEQ ID NO: 2508 aactttacttcccaactct 13401 13420
SEQ ID NO: 2203 tctcaattttgattttcaa 8519 8538 SEQ ID NO: 2509 ttgattcccttttttgaga 11529 11548
SEQ ID NO: 2204 caattttgattttcaagca 8522 8541 SEQ ID NO: 2510 tgctgaatccaaaagattg 13652 13671
SEQ ID NO: 2205 aatgcacaactctcaaacc 8541 8560 SEQ ID NO: 2511 ggtttatcaaggggccatt 12452 12471
SEQ ID NO: 2206 agttctccagcaagtacct 8596 8615 SEQ ID NO: 2512 aggttccatcgtgcaaact 11380 11399
SEQ ID NO: 2207 agtacctgagaacggagca 8608 8627 SEQ ID NO: 2513 tgctccaggagaacttact 13772 13791
SEQ ID NO: 2208 tcaaacacagtggcaagtt 8670 8689 SEQ ID NO: 2514 aactctcaagtcaagttga 13414 13433
SEQ ID NO: 2209 acaatcagcttaccctgga 8743 8762 SEQ ID NO: 2515 tccattctgaatatattgt 13372 13391
SEQ ID NO: 2210 ctggatagcaacactaaat 8757 8776 SEQ ID NO: 2516 attttctgaacttccccag 12694 12713
SEQ ID NO: 2211 ctgacctgcgcaacgagat 8821 8840 SEQ ID NO: 2517 atctgatgaggaaactcag 12251 12270
SEQ ID NO: 2212 agatgagggaacacatgaa 8921 8940 SEQ ID NO: 2518 ttcatgtccctagaaatct 10030 10049
SEQ ID NO: 2213 tcaacttttctaaacttga 9052 9071 SEQ ID NO: 2519 tcaaggataacgtgtttga 12610 12629
SEQ ID NO: 2214 ttctaaacttgaaattcaa 9059 9078 SEQ ID NO: 2520 ttgatgatgctgtcaagaa 7300 7319
SEQ ID NO: 2215 gaaattcaatcacaagtcg 9069 9088 SEQ ID NO: 2521 cgacgaagaaaataatttc 13558 13577
SEQ ID NO: 2216 cactgtttggagaagggaa 9133 9152 SEQ ID NO: 2522 ttccagaaagcagccagtg 12498 12517
SEQ ID NO: 2217 actgtttggagaagggaag 9134 9153 SEQ ID NO: 2523 cttccccaaagagaccagt 2890 2909
SEQ ID NO: 2218 aattctcttttcttttcag 9213 9232 SEQ ID NO: 2524 ctgattactatgaaaaatt 13630 13649
SEQ ID NO: 2219 ttcttttcagcccagccat 9222 9241 SEQ ID NO: 2525 atggaaaagggaaagagaa 13486 13505
SEQ ID NO: 2220 tttgaaagttcgttttcca 9275 9294 SEQ ID NO: 2526 tggaagtgtcagtggcaaa 10372 10391
SEQ ID NO: 2221 cagggaagatagacttcct 9304 9323 SEQ ID NO: 2527 aggacctttcaaattcctg 9840 9859
SEQ ID NO: 2222 ataagtacaaccaaaattt 9397 9416 SEQ ID NO: 2528 aaatcaggatctgagttat 14030 14049
SEQ ID NO: 2223 acaacgagaacattatgga 9427 9446 SEQ ID NO: 2529 tccattctgaatatattgt 13372 13391
SEQ ID NO: 2224 aggaataaatggagaagca 9455 9474 SEQ ID NO: 2530 tgctggaattgtcattcct 11726 11746
SEQ ID NO: 2225 agcaaatctggatttctta 9470 9489 SEQ ID NO: 2531 taagttctctgtacctgct 11711 11730
SEQ ID NO: 2226 tcctttaacaattcctgaa 9494 9513 SEQ ID NO: 2532 ttcaaaacgagcttcagga 13198 13217
SEQ ID NO: 2227 tttaacaattcctgaaatg 9497 9516 SEQ ID NO: 2533 catttgatttaagtgtaaa 9613 9632 1
SEQ ID NO: 2228 acacaataatcacaactcc 9526 9545 SEQ ID NO: 2534 ggagacagcatcttcgtgt 11203 11222 1
SEQ ID NO: 2229 aagatttctctctatggga 9553 9572 SEQ ID NO: 2535 tcccagaaaacctcttctt 3928 3947 1 3
SEQ ID NO: 2230 gaaaaaacaggcttgaagg 9570 9589 SEQ ID NO: 2536 ccttttacaattcattttc 13013 13032 1 3
SEQ ID NO: 2231 ttgaaggaattcttgaaaa 9582 9601 SEQ ID NO: 2537 ttttgagaatgaatttcaa 10414 10433 1 3
SEQ ID NO: 2232 tgaaggaattcttgaaaac 9583 9602 SEQ ID NO: 2538 gttttggctgataaattca 11283 11302 1 3
SEQ ID NO: 2233 agctcagtataagaaaaac 9632 9651 SEQ ID NO: 2539 gtttgataagtacaaagct 9797 9816 1 3
SEQ ID NO: 2234 tcaaatcctttgacaggca 9712 9731 SEQ ID NO: 2540 tgcctgagcagaccattga 11680 11699 1 3
SEQ ID NO: 2235 atgaascaaaaattaagtt 9781 9800 SEQ ID NO: 2541 aactttgcactatgttcat 12754 12773 1 3
SEQ ID NO: 2236 aattcctggatacactgtt 9851 9870 SEQ ID NO: 2542 aacacatgaatcacaaatt 8930 8949 1 3
SEQ ID NO: 2237 ttccagttgtcaatgttga 9868 9887 SEQ ID NO: 2543 tcaaaacgagcttcaggaa 13199 13218 1 3
SEQ ID NO: 2238 aagtgtctccattcaccat 9886 9905 SEQ ID NO: 2544 atgggaagtataagaactt 4834 4853 1 3
SEQ ID NO: 2239 gtcagcatgcctagtttct 9942 9961 SEQ ID NO: 2545 agaaaaggcacaccttgac 11072 11091 1 3
SEQ ID NO: 2240 ctgccatgggcaatattac 10105 10124 SEQ ID NO: 2546 gtaagaaaatacagagcag 6432 6451 1 3
SEQ ID NO: 2241 tgaataccaatgctgaact 10159 10178 SEQ ID NO: 2547 agttgaaggagactattca 7216 7235 1 3
SEQ ID NO: 2242 tattgttgctcatctcctt 10193 10212 SEQ ID NO: 2548 aaggaaacataaactaata 12881 12900 1 3
SEQ ID NO: 2243 tgttgctcatctcctttct 10196 10215 SEQ ID NO: 2549 agaagaaatctgcagaaca 12423 12442 1 3
SEQ ID NO: 2244 tctgtcattgatgcactgc 10224 10243 SEQ ID NO: 2550 gcagtagactataagcaga 13920 13939 1 3
SEQ ID NO: 2245 ccacagctctgtctctgag 10297 10316 SEQ ID NO: 2551 ctcagggatctgaaggtgg 8187 8206 1 3
SEQ ID NO: 2246 atttgtggagggtagtcat 10322 10341 SEQ ID NO: 2552 atgaagtagaccaacaaat 7153 7172 1 3
SEQ ID NO: 2247 atatggaagtgtcagtggc 10369 10388 SEQ ID NO: 2553 gccacactccaacgcatat 10770 10789 1 3
SEQ ID NO: 2248 tggaaataccaagtcaaaa 10445 10464 SEQ ID NO: 2554 ttttacaattcattttcca 13015 13034 1 3
SEQ ID NO: 2249 aagtcaaaacctactgtct 10455 10474 SEQ ID NO: 2555 agacctagtgattacactt 12851 12870 1 3
SEQ ID NO: 2250 actgtctcttcctccatgg 10467 10486 SEQ ID NO: 2556 ccatgcaagtcagcccagt 10916 10935 1 3
SEQ ID NO: 2251 cttcctccatggaatttaa 10474 10493 SEQ ID NO: 2557 ttaatcgagaggtatgaag 7140 7159 1 3
SEQ ID NO: 2252 attcttcaatgctgtactc 10504 10523 SEQ ID NO: 2558 gagttgagggtccgggaat 12234 12253 1 3
SEQ ID NO: 2253 ttgaccacaagcttagctt 10540 10559 SEQ ID NO: 2559 aagcgcacctcaatatcaa 12028 12047 1 3
SEQ ID NO: 2254 cctcacctcttacttttcc 10565 10584 SEQ ID NO: 2560 ggaactattgctagtgagg 10641 10660 1 3
SEQ ID NO: 2255 agctgcagggcacttccaa 10702 10721 SEQ ID NO: 2561 ttgggaagaagaggcagct 12281 12300 1 3
SEQ ID NO: 2256 ttccaaaattgatgatatc 10715 10734 SEQ ID NO: 2562 gatatacactagggaggaa 12737 12756 1 3
SEQ ID NO: 2257 gagaacatacaagcaaagc 10852 10871 SEQ ID NO: 2563 gcttggttttgccagtctc 2459 2478 1 3
SEQ ID NO: 2258 atggcaaatgtcagctctt 10889 10908 SEQ ID NO: 2564 aagaggtatttaaagccat 12952 12971 1 3
SEQ ID NO: 2259 tggcaaatgtcagctcttg 10890 10909 SEQ ID NO: 2565 caagaggtatttaaagcca 12951 12970 1 3
SEQ ID NO: 2260 ttgttcaggtccatgcaag 10906 10925 SEQ ID NO: 2566 cttgggggaggaggaacaa 14058 14077 1 3
SEQ ID NO: 2261 tgttcaggtccatgcaagt 10907 10926 SEQ ID NO: 2567 acttgggggaggaggaaca 14057 14076 1 3
SEQ ID NO: 2262 agttccttccatgatttcc 10932 10951 SEQ ID NO: 2568 ggaatctgatgaggaaact 12248 12267 1 3
SEQ ID NO: 2263 tgctaacactaagaaccag 10979 10998 SEQ ID NO: 2569 ctggatgtaaccaccagca 11178 11197 1 3
SEQ ID NO: 2264 actaagaaccagaagatca 10986 11005 SEQ ID NO: 2570 tgatcaagaacctgttagt 13339 13358 1 3
SEQ ID NO: 2265 ctaagaaccagaagatcag 10987 11006 SEQ ID NO: 2571 ctgatcaagaacctgttag 13338 13357 1 3
SEQ ID NO: 2266 cagaagatcagatggaaaa 10995 11014 SEQ ID NO: 2572 ttttcagaccaactctctg 13614 13633 1 3
SEQ ID NO: 2267 aaaaatgaagtccggattc 11010 11029 SEQ ID NO: 2573 gaatttgaaagttcgtttt 9272 9291 1 3
SEQ ID NO: 2268 gattcattctgggtctttc 11024 11043 SEQ ID NO: 2574 gaaaacctatgccttaatc 13158 13177 1 3
SEQ ID NO: 2269 aagaaaaggcacaccttga 11071 11090 SEQ ID NO: 2575 tcaaaacctactgtctctt 10458 10477 1 3
SEQ ID NO: 2270 aaggacacctaaggttcct 11107 11126 SEQ ID NO: 2576 aggacaccaaaataacctt 7564 7583 1 3
SEQ ID NO: 2271 ccagcattggtaggagaca 11191 11210 SEQ ID NO: 2577 tgtcaacaagtaccactgg 12362 12381 1 3
SEQ ID NO: 2272 ctttgtgtacaccaaaaac 11231 11250 SEQ ID NO: 2578 gtttttaaattgttgaaag 13140 13159 1 3
SEQ ID NO: 2273 ccatccctgtaaaagtttt 11269 11288 SEQ ID NO: 2579 aaaagggtcatggaaatgg 8885 8904 1 3
SEQ ID NO: 2274 tgatctaaattcagttctt 11324 11343 SEQ ID NO: 2580 aagatagtcagtctgatca 13326 13345 1 3
SEQ ID NO: 2275 aagaagctgagaacttcat 11424 11443 SEQ ID NO: 2581 atgagatcaacacaatctt 13102 13121 1 3
SEQ ID NO: 2276 tttgccctcaacctaccaa 11445 11464 SEQ ID NO: 2582 ttggtacgagttactcaaa 12633 12652 1 3
SEQ ID NO: 2277 cttgattcccttttttgag 11528 11547 SEQ ID NO: 2583 ctcaattttgattttcaag 8520 8539 1 3
SEQ ID NO: 2278 ttcacgcttccaaaaagtg 11583 11602 SEQ ID NO: 2584 cactcattgattttctgaa 12685 12704 1 3
SEQ ID NO: 2279 tgtttcagatggcattgct 11600 11619 SEQ ID NO: 2585 agcagattatgttgaaaca 11825 11844 1 3
SEQ ID NO: 2280 aatgcagtagccaacaaga 11631 11650 SEQ ID NO: 2586 tcttttcagcccagccatt 9223 9242 1 3
SEQ ID NO: 2281 ctgagcagaccattgagat 11683 11702 SEQ ID NO: 2587 atctgatgaggaaactcag 12251 12270 1 3
SEQ ID NO: 2282 tgagcagaccattgagatt 11684 11703 SEQ ID NO: 2588 aatctgatgaggaaactca 12250 12269 1 3
SEQ ID NO: 2283 ttgagattccctccattaa 11695 11714 SEQ ID NO: 2589 ttaatcttcataagttcaa 13171 13190 1 3
SEQ ID NO: 2284 acttggagtgccagtttga 11799 11818 SEQ ID NO: 2590 tcaattgggagagacaagt 6496 6515 1 3
SEQ ID NO: 2285 caaatttgaaggacttcag 11996 12015 SEQ ID NO: 2591 ctgagaacttcatcatttg 11430 11449 1 3
SEQ ID NO: 2286 a0cccagcgttcaccgatc 12048 12067 SEQ ID NO: 2592 gatccaagtatagttggct 13278 13297 1 3
SEQ ID NO: 2287 cagcgttcaccgatctcca 12052 12071 SEQ ID NO: 2593 tggacctgcaccaaagctg 13952 13971 1 3
SEQ ID NO: 2288 ctccatctgcgctaccaga 12066 12085 SEQ ID NO: 2594 tctgatatacatcacggag 13703 13722 1 3
SEQ ID NO: 2289 atgaggaaactcagatcaa 12256 12275 SEQ ID NO: 2595 ttgagttgcccaccatcat 11659 11678 1 3
SEQ ID NO: 2290 aggcagcttctggcttgct 12292 12311 SEQ ID NO: 2596 agcaagtctttcctggcct 3010 3029 1 3
SEQ ID NO: 2291 tgaaagacaacgtgcccaa 12319 12338 SEQ ID NO: 2597 ttgggagagacaagtttca 6500 6519 1 3
SEQ ID NO: 2292 tatgattatgtcaacaagt 12354 12373 SEQ ID NO: 2598 actttgcactatgttcata 12755 12774 1 3
SEQ ID NO: 2293 cattaggcaaattgatgat 12467 12486 SEQ ID NO: 2599 atcaacacaatcttcaatg 13107 13126 1 3
SEQ ID NO: 2294 ttgactcaggaaggccaag 12576 12595 SEQ ID NO: 2600 cttggtacgagttactcaa 12632 12651 1 3
SEQ ID NO: 2295 gaaacctgggatatacact 12728 12747 SEQ ID NO: 2601 agtgattacacttcctttc 12857 12876 1 3
SEQ ID NO: 2296 tcctttcgagttaaggaaa 12869 12888 SEQ ID NO: 2602 tttctgccactgctcagga 13516 13535 1 3
SEQ ID NO: 2297 gccattcagtctctcaaga 12966 12985 SEQ ID NO: 2603 tcttccgttctgtaatggc 5794 5813 1 3
SEQ ID NO: 2298 gtgctacgtaatcttcagg 12993 13012 SEQ ID NO: 2604 cctgcaccaaagctggcac 13956 13975 1 3
SEQ ID NO: 2299 agctgaaagagatgaaatt 13057 13076 SEQ ID NO: 2605 aatttattcaaaacgagct 13192 13211 1 3
SEQ ID NO: 2300 aatttacttatcttattaa 13072 13091 SEQ ID NO: 2606 ttaaaagaaatcttcaatt 13807 13826 1 3
SEQ ID NO: 2301 ttttaaattgttgaaagaa 13142 13161 SEQ ID NO: 2607 ttctctctatgggaaaaaa 9558 9577 1 3
SEQ ID NO: 2302 taatcttcataagttcaat 13172 13191 SEQ ID NO: 2608 attgagattccctccatta 11694 11713 1 3
SEQ ID NO: 2303 atattttgatccaagtata 13271 13290 SEQ ID NO: 2609 tataagcagaagcacatat 13929 13948 1 3
SEQ ID NO: 2304 tgaaatattatgaacttga 13303 13322 SEQ ID NO: 2610 tcaaccttaatgattttca 8287 8306 1 3
SEQ ID NO: 2305 caatttctgcacagaaata 13434 13453 SEQ ID NO: 2611 tattcttcttttccaattg 13826 13845 1 3
SEQ ID NO: 2306 agaagattgcagagctttc 13501 13520 SEQ ID NO: 2612 gaaatcttcaatttattct 13813 13832 1 3
SEQ ID NO: 2307 gaagaaaataatttctgat 13562 13581 SEQ ID NO: 2613 atcagttcagataaacttc 7991 8010 1 3
SEQ ID NO: 2308 ttgacctgtccattcaaaa 13672 13691 SEQ ID NO: 2614 ttttgagaatgaatttcaa 10414 10433 1 3
SEQ ID NO: 2309 tcaaaactaccacacattt 13685 13704 SEQ ID NO: 2615 aaattccttgacatgttga 7362 7381 1 3
SEQ ID NO: 2310 ttttttaaaagaaatcttc 13803 13822 SEQ ID NO: 2616 gaagtgtcagtggcaaaaa 10374 10393 1 3
SEQ ID NO: 2311 aggatctgagttattttgc 14035 14054 SEQ ID NO: 2617 gcaagggttcactgttcct 7856 7875 1 3
SEQ ID NO: 2312 tttgctaaacttgggggag 14049 14068 SEQ ID NO: 2618 ctccccaggacctttcaaa 9834 9853 1 3



# = Match Number 

B = Middle Matching Bases 
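
In the rows above, each source sequence and its match appear to be reverse complements of one another at their ends, which is what allows the two strands to pair. The following is a minimal sketch, not the method used to generate the table, for checking that relationship on one row (SEQ ID NO: 992 and SEQ ID NO: 1995). The function names are hypothetical, and the sketch does not attempt to reproduce how the "#" and "B" columns were derived, since the legend does not say.

```python
# Hypothetical helpers for inspecting a source/match pair from the table.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def complementary_ends(source: str, match: str) -> tuple[int, int, int]:
    """Compare the match with the reverse complement of the source.

    Returns (leading run, trailing run, total) of positions at which the
    two strings agree, i.e. positions that would base-pair.
    """
    rc = reverse_complement(source)
    flags = [a == b for a, b in zip(rc, match)]
    lead = next((i for i, ok in enumerate(flags) if not ok), len(flags))
    trail = next((i for i, ok in enumerate(reversed(flags)) if not ok), len(flags))
    return lead, trail, sum(flags)


source = "tatgaccttgtccagtgaa"  # SEQ ID NO: 992
match = "ttcaccaacggagaacata"   # SEQ ID NO: 1995
print(reverse_complement(source))         # ttcactggacaaggtcata
print(complementary_ends(source, match))  # (5, 4, 11)
```

For this pair, five bases at one end and four at the other are complementary, with a largely non-complementary stretch in between.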
Table 9. Selected palindromic sequences from human ApoB

Source  Start Index  End Index  Match  Start Index  End Index  #  B




SEQ ID NO: 


2619 


ggccattccagaagggaag 


517 


536 SEQ 


ID NO:


3948 cttccgttctgtaatggcc 


oouo 




1 


9 


SEQ ID NO: 


2620 


tgccatctcgagagttcca


4107


4126 SEQ


ID NO: 


3949 tggaactctctccatggca




1 U9U0 


1 


8 


SEQ ID NO: 


2621 


catgtcaaacactttgtta 


7064 


7083 SEQ 


ID NO:


3950 taacaaattccttgacatg 




70ft*? 


1 


8 


SEQ ID NO: 


2622 


tttgttataaatcttattg 


7076 


7095 SEQ 


ID NO:


3951 caataagatcaatagcaaa 


QQQQ 




1 


8 


SEQ ID NO:


2623 


tctggaaaagggtcatgga 


8888 


8907 SEQ 


ID NO: 


3959 tccatgtcccatttacaga


'1 "1 '^ftzl 
1 loO'H- 


1 1 000 


1 


8 


SEQ ID NO: 


2624 


cagctcttgttcaggtcca 


10908 10927 SEQ 


ID NO: 


3960 tggacctgcaccaaagctg




1 /y 


1 


8 


SEQ ID NO: 


2625 


ggaggttccccagctctgc 


364 


383 SEQ 


ID NO: 


3961 gcagccctgggaaaactcc 




D4/4 




7 


SEQ ID NO: 


2626 


ctgttttgaagactctcca 


1089 


1108 SEQ


ID NO:


3962 tggagggtagtcataacag


lUooD 




1 


7 


SEQ ID NO: 


2627 


agtggctgaaacgtgtgca 


1305 


1324 SEQ


ID NO:


3963 tgcagagctttctgccact


loo Id 


1 OOOt3 


1 


7 


SEQ ID NO: 


2628


ccaaaatagaagggaatct 


2076 


2095 SEQ 


ID NO: 


3964 agattcctttgccttttgg 




/I no"7 


1 


7 


SEQ ID NO: 


2629 


tgaagagaagattgaattt 


3628 


3647 SEQ 


ID NO: 


3965 aaattctcttttcttttca 


ooon 




1 


7 


SEQ ID NO: 


2630 


agtggtggcaacaccagca


4238 


4257 SEQ 


ID NO: 


3966 tgctagtgaggccaacact 


1 UOG / 


1 [JOfK> 


1 


7 


SEQ ID NO: 


2631 


aaggctccacaagtcatca 


5958 


5977 SEQ 


ID NO: 


3967 tgatgatatctggaacctt


1 nvQO 
ID / 


1 U/O 1 


1 


7 


SEQ ID NO: 


2632 


gtcagccaggtttatagca 


7733 


7752 SEQ 


ID NO: 


3968 tgctaagaaccttactgac 


f /oy 


7Qf\Q 

/oUo 


1 


7 


SEQ ID NO: 


2633 


tgatatctggaaccttgaa 


10735 10754 SEQ 


ID NO: 


3969 ttcactgttcctgaaatca 


f 0 / 1 




1 


7 


SEQ ID NO: 


2634 


gtcaagttgagcaatttct 


13431 13450 SEQ 


ID NO: 


3970 agaaaaggcacaccttgac 


1 lUOU 


1 1 uyy 


1 


7 


SEQ ID NO: 


2635 


atccagatggaaaagggaa 


13488 13507 SEQ 


ID NO: 


3971 ttccaatttccctgtggat 


ODOO 


OiVl 


1 


7 


SEQ ID NO: 


2636 


atttgtttgtcaaagaagt 


4551 


4570 SEQ


ID NO: 


3972 acttcagagaaatacaaat 




1 1 4ZO 


4 


6 


SEQ ID NO: 


2637 


ctggaaaatgtcagcctgg


212


231 SEQ


ID NO: 


3973 ccagacttcxgtttaccag 


8243 


8262 


2 


6 


SEQ ID NO: 


2638 


accaggaggttcttcttca 


1737 


1756 SEQ 


ID NO: 


3974 tgaagtgtagtctcctggt 


5097 


5116 


2 


6 


SEQ ID NO: 


2639 


aaagaagttctgaaagaat 


1964 


1983 SEQ


ID NO: 


3975 attccatcacaaatccttt 


9669 


9688 


2 


6 


SEQ ID NO: 


2640 


gctacagcttatggctcca 


3578 


3597 SEQ 


ID NO: 


3976 tggatctaaatgcagtagc 


11631 


11650 


2 


6 


SEQ ID NO: 


2641 


atcaatattgatcaatttg 


6422 


6441 SEQ 


ID NO:


3977 caaagaagtcaagattgat 


4561 


4580 


2 


6 


SEQ ID NO: 


2642 


gaattatcttttaaaacat 


7334 


7353 SEQ 


ID NO: 


3978 atgtgttaacaaaatattc 


11502 



11521


2 


6 


SEQ ID NO: 


2643 


cgaggcccgcgctgctggc 


138 


157 SEQ 


ID NO: 


3979 gccagaagtgagatcctcg


3515 


3534 


1 


6 


SEQ ID NO: 


2644 


acaactatgaggctgagag 


279 


298 SEQ 


ID NO: 


3980 ctctgagcaacaaatttgt 


10317 


10336 


1 


6 


SEQ ID NO: 


2645 


gctgagagttcxagtggag 


290 


309 SEQ


ID NO:


3981 ctccatggcaaatgtcagc 


10893 


10912 


1 


6 


SEQ ID NO: 


2646 


tgaagaaaaccaagaactc 


456 


475 SEQ 


ID NO 


3982 gagtcattgaggttcttca 


4937 


4956 


1 


6 


SEQ ID NO: 


2647 


cctacttacatcctgaaca 


566 


585 SEQ


ID NO:


3983 tgttcataagggaggtagg


12774 


12793 


1 


6 


SEQ ID NO: 


2648 


ctacttacatcctgaacat 


567 


586 SEQ 


ID NO:


3984 atgttcataagggaggtag


12773 


12792 


1 


6 


SEQ ID NO: 


2649 


gagacagaagaagccaagc 


623 


642 SEQ


ID NO:


3985 gcttggttttgccagtctc 


2467 


2486 


1 


6 


SEQ ID NO: 


2650


cactcactttaccgtcaag


679 


698 SEQ 


ID NO:


3986 cttgaacacaaagtcagtg


OUUO 




1 


6 


SEQ ID NO: 


2651 


ctgatcagcagcagccagt 


830 


849 SEQ 


ID NO:


3987 actgggaagtgcttatcag






1 


6 


SEQ ID NO: 


2652 


actggacgctaagaggaag 


862 


881 SEQ 


ID NO:


3988 cttccccaaagagaccagt




901 7 

zy 1 / 


1 


6 


SEQ ID NO: 


2653 


agaggaagcatgtggcaga 


873 


892 SEQ 


ID NO:


3989 tctggcatttactttctct






1 


6 


SEQ ID NO: 


2654


tgaagactctccaggaact 


1095 


1114 SEQ


ID NO:


3990 agttgaaggagactattca 






1 


6 


SEQ ID NO:


2655 


ctctgagcaaaatatccag 


1129 


1148 SEQ


ID NO:


3991 ctggttactgagc^gagag


1169 


1188 


1 


6 


SEQ ID NO: 


2656 


atgaagcagtcacatctct


1197 


1216 SEQ


ID NO:


3992 agagctgccagtccttcat


10024 


10043 


1 


6 


SEQ ID NO: 


2657 


ttgccacagctgattgagg 


1217 


1236 SEQ 


ID NO:


3993 cctcctacagtggtggcaa


4230 


4249 


1 


6 


SEQ ID NO: 


2658 


agctgattgaggtgtccag 


1224 


1243 SEQ 


ID NO:


3994 ctggattccacatgcagct


11855 


11874 


1 


6 


SEQ ID NO: 


2659 


tgctccactcacatcctcc 


1286 


1305 SEQ 


ID NO:


3995 ggaggctttaagttcagca


7609 


7628 


1 


6 


SEQ ID NO: 


2660 


tgaaacgtgtgcatgccaa 


1311 


1330 SEQ 


ID NO:


3996 ttgggagagacaagtttca


6508 


6527 


1 


6 


SEQ ID NO: 


2661 


gacattgctaattacctga 


1511 


1530 SEQ 


ID NO:


3997 tcagaagctaagcaatgtc


7240 


7259 


1 


6 


SEQ ID NO: 


2662 


ttcttcttcagactttcct 


1746 


1765 SEQ 


ID NO:


3998 aggagagtccaaattagaa


8506 


8625 


1 


6 


SEQ ID NO: 


2663 


ccaatatcttgaactcaga 


1911 


1930 SEQ 


ID NO:


3999 tctgaattcattcaattgg


6493 


6512 


1 


6 


SEQ ID NO: 


2664 


aaagttagtgaaagaagtt 


1954 


1973 SEQ 


ID NO:


4000 aactaccctcactgccttt


2140 


2159 


1 


6 


SEQ ID NO: 


2665 


aagttagtgaaagaagttc 


1955 


1974 SEQ 


ID NO:


4001 gaacctctggcatttactt


5924 


5943 


1 


6 


SEQ ID NO: 


2666 


aaagaagttctgaaagaat


1964 


1983 SEQ 


ID NO:


4002 attctctggtaactacttt


5490 


5509 


1 


6 
SEQ ID NO:


2667 






'^'^'^^SEQ ID NO: 


^nn*^ nafrttanacactnacaaa 


50Q5 


5094 




6 


SEQ ID NO: 


2668 






z^uagEQ ID NO: 


^nn^t+fanrf^atrrinr'traapa 




5797 




6 


SEQ ID NO: 


2669 


cay y aagyg cioaaay aa i 




^ooasEQ ID NO: 


A(\C\^ attf^ptttaacaattcpffi 
*Tvww diiVrfwiUtuowaci iiwwiy 


9500 


9519 




6 


SEQ ID NO: 


2670 




2570 


zoowQEQ ID NO: 


4nnR fattra^ttaaca attcr^f 


gdP9 


9518 




6 


SEQ ID NO: 


2671 


y cic2 y y y (jicaoay oaiy cil» 




^oa 1 SEQ ID NO: 


4^^7nl■^*anff'Hca^^ptpttp 


7Q29 


7941 

1 1 




6 


SEQ ID NO: 


2672 




2580 


^oyw3EQ ID NO: 


^vjuoaydoyyaiyyi^aiiiiiiy 


1 4008 


14027 




6 


SEQ ID NO: 


2673 


Caiy y ay raaiyt^LjlLiy CSa 


if-U 1 1 


ziDougEQ ID NO: 


^.nOQ Hna n?^ n r!03i i=j a n tnp3 tn 


7197 


7146 




6 


SEQ ID NO: 


2674 






^/^UDSEQ ID NO: 










6 


SEQ ID NO: 


2675 




9RQ9 


i I SEQ ID NO: 


H'U 1 1 OLtriirfiyy yyi/dUafLdiyd 


5147 


6166 




6 


SEQ ID NO: 


2676 




O 1 / o 


^ lO'^SEQ ID NO: 


<d,n^ 9iH"f»tr'asinsir*f*anflnartnt 
HrU 1 ^oiLFL(.<rdcaydoudOdydyy I 


12^84 


1^003 




6 


SEQ ID NO: 


2677 


999 C3399C9tClLGC8Scl 


OO f t3 


ooa^ SEQ ID NO: 


fu 1 0 LUiydddycii,«ciidVifyiyVi>i^o 


19*^95 

1 


19^44 




6 


SEQ ID NO: 


2678 


S C UCL y g c! G u 1 1 W cd ^ d edi,^ 




cjq- i^SEQ ID NO: 


•tL/ 1 1 ly Liy L>iaeiy^i.LVi>dyy y I 




5709 




6 


SEQ ID NO: 


2679 


aiygycgacciaagiiyty 




c>«H-ODSEQ ID NO: 


ifin '1 ^ i^siosiSsa4'f antHv^st r^^^cst 


ftQ4Q 


fiQRR 




S 


SEQ ID NO: 


2680 




00:^0 


'^t>403EQ ID NO: 




0000 


000 / 




6 


SEQ ID NO: 


2681 




000*r 


^od^SEQ ID NO: 


1 r LllllLyydddiyL<L(dLiy 


AR'=>1 






6 
\j 


SEQ ID NO: 


2682 


giag aiaccaaaaaaaig ^ 


OQOO 


Jbu/ SEQ ID NO: 


*-ru 1 oKJdiyiy di.ygyioi.ddC 










SEQ ID NO: 


2683 


g cTicay iicaniy g aci 


1 / 


40Jt)SEQ ID NO: 


4U 1 ody loddyddyy dL^iiddyo 










SEQ ID NO: 


2684 


n {"tf t o /^ ^ /I 

iTigiiLy icaaayaagic 




1 SEQ ID NO: 


^ w y d 0 1 L 0 d y cs y d c! d L d d d d 




1 1497 




6 


SEQ ID NO: 


2685 


Iigiuy Luaaay aayiCa 




'+3/'ij:SEQ ID NO: 


'-rWt. 1 lyciOLLOdy dydddLdv^dd 


1 1407 


1 149R 




6 


SEQ ID NO: 


2686 


i99caai999^^^^'-^9^^ 




^^'^SEQ ID NO: 


^rr?*? an^nan aatr'ar*/^r'tm^r*a 
H-U^^ dy uy dy dd LUdUUv-riy vA«d 


R957 


R94R 






SEQ ID NO: 


2687 




'=iQ9'=» 


oa^'tSEQ ID NO: 


*tu^o d d d y y d y d ly LCfd dy y y 11 


10fi07 


10626 




6 


SEQ ID NO: 


2688 


i« el I L let Olllirf U# L^d a 




owoosEQ ID NO: 


4n24traf ttn 9 a 9 a 9 afa 9 ata 


7034 


7053 




6 


SEQ ID NO: 


2689 


delay LOay ly ouv^iy oiica 


\JVJ 1 r 


DUODSEQ ID NO: 


4n9fitaan apicpf tar*tn a f*tit 


7792 


781 1 




6 


SEQ ID NO: 


2690 


icccaiiiiug ay a ecu 


QOOU 


oo^ySEQ ID NO: 


4n9R a a n n arH'fr'ann a aln n n a 
*tL/i£U ddy y doiiody y dd ly y y d 


19019 


190*^1 




6 


SEQ ID NO: 


2691 


U d lOcI cl Ld L ly d LOd d 11 1 


R491 


DH'tusEQ ID NO: 


4097 aflattaaaaantf^Hnatfi 


6740 


5759 




6 


SEQ ID NO: 


2692 


Idddy dldy LLcliy dllla 


RR7'^ 


ooa^SEQ ID NO: 


4098 taaacinaaaanttnnttta 


9027 


9046 




6 


SEQ ID NO: 


2693 


Id L ly d ly ddd lUei I ly da 




o/*fusEQ ID NO: 


409Qtfcaaaaacttaa9a99ta 


8015 


3034 




6 


SEQ ID NO: 


2694 


d ly d LVr LdUdLLiyiLldl 


6798 


6817 SEQ ID NO:


40 9t99an999tt99arite9t 

*Twww CI icmciu aci 01 icicicim ivcil 


7388 


7407 




6 


SEQ ID NO: 


2695 


St (~t zi n o r* SI sit 9 sin SI sifsrf 

ay dy dUauaxauay aalal 


6927 


6946 SEQ ID NO:


40*^1 atefaHntnantnf*fH"fH" 


13390 


13409 




6 


SEQ ID NO: 


2696 


na/^a<^atapanaatatana 
y ciLfCiLfCiicii^ciy aci i.«i.ciy ci 


6930 


6949 SEQ ID NO:


4032 tctaaattcaattctf ate 

^T^^^^^B ^X^LCIdd CLwdXJ bL\^CL>4 


11335 


11354 




6 


SEQ ID NO: 


2697 


a/^r'atntr*a a apar*tttnt 

ety Odiy IL^ddctL'ClL'llLy L 


7062 


7081 SEQ ID NO: 


4m a r a a a a tea nta pprtnrt 


6015 


6034 




6 


SEQ ID NO: 


2698 


1 1 1 1 idy dy y dddu o ddy y 


7523 


7542 SEQ ID NO:


4m4prtt1'fit'fi+apannaaaaa 


11238


11257 




6 


SEQ ID NO: 


2699 


iiiidydyydddiArddyy^ 


7524 


7543 SEQ ID NO: 


40 *^ R fi pptHrifn tapa pp a a a a 
'tv<'w y i«wiii^iy idodwwciddd 


11237 


11256 




6 


SEQ ID NO: 


2700 


nn aanat stn£tr4trv^f nan 
u y d dy d id^dv^i i VA^ dd 


9315 


9334 SEQ ID NO: 


4036 Itcac aa atactattttcc 


12832 


12851 




6 


SEQ ID NO: 


2701 


SI t tf f^t St o tr^r^f^ SI n 

vauiy Liioiy dy Luuudy 


9342 


9361 SEQ ID NO: 


40*^7 pinnnsipptappa an antn 


12531 


12550 




6 


SEQ ID NO: 


2702 


LrdUdddiouuLyyt^iy ly 


9676 


9695 SEQ ID NO:


40*^^ papatttpaannaattntn 


10071 


10090 

1 w w C/ w 




6 


SEQ ID NO: 


2703 


llUUiyy aldUdUiy ULfWr 


9861 


9880 SEQ ID NO:


'rvjOk? y yddLriy iiydLviUdy y aci 


12577 


195Q6 




6 


SEQ ID NO: 


2704 


9 dd d ICiv^ddy LtllldCl 


10050 10069 SEQ ID NO:


4040 5ifisin/*i^9rtrtfpnanf*'tttp 
'^\j'^Lr dy dy oody y dy ^iii^^ 


11052


11071




6 


SEQ ID NO: 


2705 


-ttt/H-fr'afnttrat/H-nt 

utciicaicucaiciyi 


10218 10237 SEQ ID NO:


A.r\A.'\ si^an/Hnsiaananflifoaaa 
1 dUdyoiydddydydiyddd 




1*^089 




u 


SEQ ID NO: 


2706 


li^dvAf y(.f idddy y dy v^dy 


10529 10548 SEQ ID NO:


4049 ptnr^apnpttfnanntann 
'tU't^ oiyirfdoyoLiLy dyy Ldy d 


11769


11788




6 


SEQ ID NO: 


2707 


ctaccgctaaaggagcagt


10530 10549 SEQ ID NO:


4043 actgcacgctttgaggtag 


11768 


11787 




6 


SEQ ID NO: 


2708 


agggcctctttttcaccaa 


10839 10858 SEQ ID NO:


4044 ttggccaggaagtggccct


10965 


10984 




6 


SEQ ID NO: 


2709 


ttctccatccctgtaaaag 


11273 


11292 SEQ ID NO:


4045 ctttttcaccaacggagaa 


10846 


10865 




6


SEQ ID NO: 


2710 


gaaaaacaaagcagattat 


11824 11843 SEQ ID NO:


4046 ataaactgcaagatttttc 


13608 


13627 




6 


SEQ ID NO: 


2711 


actcactcattgattttct


12690 12709 SEQ ID NO:


4047 agaaaatcaggatctgagt 


14035 


14054 




6 


SEQ ID NO: 


2712 


taaactaatagatgtaatc 


12898 12917 SEQ ID NO:


4048 gattaccaccagcagttta 


13586 


13605 




6 


SEQ ID NO: 


2713 


caaaacgagcttcaggaao 


13208 13227 SEQ ID NO:


4049 dtcgtgaagaatattttg 


13268 


13287 




6 


SEQ ID NO: 


2714 


tggaataatgctcagtgtt 


2374 


2393 SEQ ID NO:


4050 aacacttacttgaattcca


10670 


10689 


3 


5 


SEQ ID NO: 


2715 


gatttgaaatccaaagaag 


2408 


2427 SEQ ID NO:


4051 cttcagagaaatacaaatc 


11410 


11429 


3 


5 


SEQ ID NO: 


2716 


atttgaaatccaaagaagt 


2409 


2428 SEQ ID NO: 


4052 acttcagagaaatacaaat 


11409 


11428 


3 


5 



SEQ ID NO:


2717 


atcaacagccgcttctttg 


998 


1017 SEQ


ID NO: 


4053 caaagaagtcaagattgat 


4561 


4580


2 


5 


SEQ ID NO 


2718 


tgttttgaagactctccag


1090 


1109 SEQ


ID NO: 


4054 ctggaaagttaaaacaaca


6963 


6982


2 


5 


SEQ ID NO 


2719 


cccttctgatagatgtggt


1332 


1351 SEQ 


ID NO: 


4055 accaaagctggcaccaggg


13969 


13988 


2 


5 


SEQ ID NO 


2720 


tgagcaagtgaagaacttt 


1876 


1895 SEQ


ID NO: 


4056 aaagccattcagtctctca 


12971 


12990 


2 


5 


SEQ ID NO:


2721 


atttgaaatccaaagaagt 


2409 


2428 SEQ 


ID NO: 


4057 acttttctaaacttgaaat 


9063 


9082


2 


5 


SEQ ID NO 


2722 


atccaaagaagtcccggaa 


2416 


2435 SEQ


ID NO: 


4058 ttccggggaaacctgggat 


12729 


12748 


2 


6 


SEQ ID NO:


2723


agagcctacctccgcatct 


2438 


2457 SEQ


ID NO:


4059 agatggtacgttagcctct 


11929 


11948 


2 


5 


SEQ ID NO 


2724 


aatgcctttgaactcccca


2618 


2637 SEQ 


ID NO:


4060 tgggaactacaatttcatt


7020 


7039 


2 


5 


SEQ ID NO:


2725


gaagtccaaattccggatt 


3305 


3324 SEQ 


ID NO: 


4061 aatcttcaatttattcttc




13842 


2 


5 


SEQ ID NO 


2726 


tgcaagcagaagccagaag 


3504 


3523 SEQ


ID NO 


4062 cttcaggttccatcgtgca 


11384


11403 


2 


5 


SEQ ID NO 


2727 


gaagagaagattgaatttg 


3629 


3648 SEQ


ID NO 


4063 caaaacctactgtctcttc 


10467 


10486 


2 


5 


SEQ ID NO 


2728 


atgdaaaggcacatatgg 


4605 


4624 SEQ


ID NO 


4064 ccatatgaaagtcaagcat 


12664 


12683 


2 


5 


SEQ ID NO 


2729 


tccctcacctccacctctg 


4745 


4764 SEQ 


ID NO:


4065 cagattctcagatgaggga


8920 


8939 


2 


5 


SEQ ID NO 


2730 


atttacagctctgacaagt


5435 


5454 SEQ


ID NO 


4066 acttttctaaacttgaaat 


9063 


9082 


2 


5 


SEQ ID NO:


2731


aggagcctaccaaaataat 


5602 


5621 SEQ 


ID NO: 


4067 attatgttgaaacagtcct


11838 


11857 


2 


5 


SEQ ID NO 


2732 


aaagctgaagcacatcaat 


6409 


6428 SEQ


ID NO 


4068 attgttgctcatctccttt 


10202 


10221 


2 


5 


SEQ ID NO 


2733 


ctgctggaaacaacgagaa 


9426 


9445 SEQ


ID NO: 


4069 ttctgattaccaccagcag


13582 


13601 


2 


5 


SEQ ID NO 


2734 


ttgaaggaattcttgaaaa 


9590 


9609 SEQ


ID NO:


4070 ttttaaaagaaatcttcaa 


13813 


13832 


2 


5 


SEQ ID NO 


2735 


gaagtaaaagaaaattttg 


10751 


10770 SEQ


ID NO:


4071 caaaacctactgtctcttc 


10467 


10486 


2 


5 


SEQ ID NO 


2736 


tgaagaagatggcaaattt 


11992 12011 SEQ 


ID NO 


4072 aaatgtcagctcttgttca 


10902 


10921 


2 


5 


SEQ ID NO 


2737 


aggatctgagttattttgc


14043 14062 SEQ


ID NO 


4073 gcaagtcagcccagttcct 


10928 


10947 


2 


5 


SEQ ID NO 


2738 


gtgcccttctcggttgctg 


26 


45 SEQ


ID NO 


4074 cagccattgacatgagcac 


5748 


5767 


1 


5 


SEQ ID NO 


2739 


ggcgctgcctgcgctgctg 


154 


173 SEQ


ID NO:


4075 cagctccacagactccgcc 


3070 


3089 


1 


5 


SEQ ID NO 


2740 


ctgcgctgctgctgctgct 


162 


181 SEQ 


ID NO: 


4076 agcagaaggtgcgaagcag 


3232 


3251 


1 


5 


SEQ ID NO 


2741 


gctgctggcgggcgccagg 


178 


197 SEQ


ID NO:


4077 cctggattccacatgcagc 


11854 


11873 


1 


5 


SEQ ID NO 


2742 


aagaggaaatgctggaaaa 


201 


220 SEQ


ID NO:


4078 tttttcttcactacatctt


2592 


2611 


1 


5 


SEQ ID NO 


2743 


ctggaaaatgtcagcctgg 


212 


231 SEQ 


ID NO: 


4079 ccagacttccacatcccag


3923 


3942 


1 


5 


SEQ ID NO 


2744 


tggagtccctgggactgct 


304 


323 SEQ 


ID NO: 


4080 agcatgcctagtttctcca


9953 


9972 


1 


5 


SEQ ID NO 


2745 


ggagtccctgggactgctg 


305 


324 SEQ


ID NO: 


4081 cagcatgcctagtttctcc 


9952 


9971 


1 


5 


SEQ ID NO 


2746 


tgggactgctgattcaaga 


313 


332 SEQ


ID NO 


4082 tcttccatcacttgaccca 


2050 


2069 


1 


5 


SEQ ID NO 


2747 


ctgctgattcaagaagtgc 


318 


337 SEQ


ID NO:


4083 gcacaccttgacattgcag


11087 


11106 


1 


5 


SEQ ID NO 


2748 


tgccaccaggatcaactgc 


334 


353 SEQ


ID NO:


4084 gcaggctgaactggtggca


2725 


2744 


1 


5 


SEQ ID NO 


2749 


gccaccaggatcaactgca 


335 


354 SEQ


ID NO:


4085 tgcaggctgaactggtggc


2724 


2743 


1 


5 


SEQ ID NO 


2750


tgcaaggttgagctggagg 


350 


369 SEQ


ID NO 


4086 cctccacctctgatctgca 


4752 


4771 


1 


5 


SEQ ID NO 


2751 


caaggttgagctggaggtt 


352 


371 SEQ 


ID NO 


4089 aacccctacatgaagcttg 


13763 


13782 


1 


5 


SEQ ID NO: 


2752 


ctctgcagcttcatcctga 


377 


396 SEQ


ID NO:


4090 tcaggaagcttctcaagag


13219 


13238 


1 


5 


SEQ ID NO 


2753 


cagcttcatcctgaagacc


382 


401 SEQ 


ID NO 


4091 ggtcttgagttaaatgctg 


4985 


5004 




5 


SEQ ID NO 


2754 


gcttcatcctgaagaccag 


384 


403 SEQ


ID NO:


4092 ctggacgctaagaggaagc 


863 


882 


1 


5 


SEQ ID NO 


2755


tcatcctgaagaccagcca 


387 


406 SEQ


ID NO:


4093 tggcatggcattatgatga


3612 


3631 




5 


SEQ ID NO:


2756 


gaaaaccaagaactctgag 


460 


479 SEQ


ID NO 


4094 ctcaaccttaatgattttc 


8294 


8313 




5 


SEQ ID NO: 


2757 


agaactctgaggagtttgc 


468 


487 SEQ


ID NO 


4095 gcaagdatacagtattct 


8385 


8404 




5 


SEQ ID NO:


2758 


tctgaggagtttgctgcag 


473 


492 SEQ 


ID NO 


4096 ctgcaggggatcccccaga


2534 


2553 




5 


SEQ ID NO 


2759 


tttgctgcagccatgtcca


482 


501 SEQ 


ID NO 


4097 tggaagtgtcagtggcaaa


10380 


10399 




5 


SEQ ID NO 


2760 


caagaggggcatcatttct


586 


605 SEQ


ID NO:


4098 agaataaatgacgttcttg


7043 


7062 




5 


SEQ ID NO 


2761 


tcactttaccgtcaagacg 


682 


701 SEQ 


ID NO:


4099 cgtctacactatcatgtga


4368 


4387 




5 


SEQ ID NO: 


2762 


tttaccgtcaagacgagga 


686 


705 SEQ


ID NO:


4100 tccttgacatgttgataaa


7374 


7393 




5 


SEQ ID NO: 


2763 


cactggacgctaagaggaa 


861 


880 SEQ


ID NO:


4101 ttccagaaagcagccagtg


12506 


12525 




5 


SEQ ID NO 


2764 


aggaagcatgtggcagaag 


875 


894 SEQ


ID NO:


4102 cttcatacacattaatcct


9996 


10015 




5 


SEQ ID NO 


2765 


caaggagcaacacctcttc


901 


920 SEQ


ID NO:


4103 gaagtagtactgcatcttg


6843 


6862 




5 
SEQ ID NO:


2766 


acagactttgaaacttgaa 


967 




ID NO: 


41 04ttcaattGttcaatgctgt 


10508 


10527 


1 5 


SEQ ID NO: 


2767 


tgatgaagcagtcacatct 


1195


1214 SEQ


ID NO: 


41 05 agatttgaggattccatca 


7984 


8003 1 


1 5 


SEQ ID NO: 


2768 


agcagtcacatctctcttg


1201 


1220 SEQ 


ID NO: 


41 06 caaggagaaactgactgct 


6532 


6551 1 


1 5 


SEQ ID NO: 


2769 


ccag coccatcacntaca 


1239 


1258 SEQ 


ID NO: 


41 07tgtagtctcctggtgctgg 


5102 


5121 1 


1 5 


SEQ ID NO: 


2770 


ctccactcacatcctccag 


1288 


1307 SEQ


ID NO:


4108 ctggagcttagtaatggag


8717 


8736 


I 5 


SEQ ID NO: 


2771 


catgccaacccccttctga


1322 


1341 SEQ 


ID NO: 


41 09tcagatgagggaacacatg 


8927 


8946 


i 5 


SEQ ID NO: 


2772 


gagagatcttcaacatggc 


1398 


1417SEQ 


ID NO: 



4110 gccaccctggaactctctc 


10877 


10896 


1 5 


SEQ ID NO: 


2773 


tcaacatggcgagggatca 


1407 


1426SEQ 


ID NO: 


4111 tgatcccacctctcattga


2973 


2992 


1 5 


SEQ ID NO: 


2774 


ccaccttgtatgcgctgag 


1437 


1456 SEQ 


ID NO: 


41 12ctcagggatctgaaggtgg 


8195 


8214 ' 


1 5 


SEQ ID NO: 


2775 


gtcaacaactatcataaga 


1463 


1482SEQ 


ID NO: 


4113 tcttgagttaaatgctgac


4987 


5006 ' 


1 5 


SEQ ID NO: 


2776 


tggacattgctaattacct


1509 


1528 SEQ 


ID NO: 


41 14aggtatattcgaaagtcca 


12807 


12826 


1 5 


SEQ ID NO: 


2777 


ggacattgctaattacctg


^ r" ^ 

1510 


1529SEQ 


ID NO: 


A ^ ji f"* rail a 

41 IScaggtatattcgaaagtcc 


12806 


12825 


i 5 


SEQ ID NO: 


2778 


ttctgcgggtcattggaaa 


1581 


1600SEQ 


ID NO: 



4116 tttcacatgccaaggagaa 


6522 


6541 


I 5 


SEQ ID NO: 


2779 


ccagaactcaagtcttcaa


1628 


1647SEQ 


ID NO: 


41 17ttgaagtgtagtctcctgg 


5096 


5115 ' 


i 5 


SEQ ID NO:


2780 


agtcttcaatcctgaaatg 


1638 


1657 SEQ 


ID NO: 


41 IScatttctgattggtggact 


7765 


7784 


1 5 


SEQ ID NO 


2781 


tgagcaagtgaagaacttt


1876 


1895SEQ 


ID NO: 


A Ji A <^ A. i J. ■ ■ a 

41 19aaagtgccscttttactca 


6191 


6210 


i 5 


SEQ ID NO 


2782 


agcaagtgaagaactttgt 


1878 


1897SEQ 


ID NO. 


41 20 acaaagtcagtgccctgct 


6015 


6034 


i 5 


SEQ ID NO 


2783 


tctgaaagaatctcaactt


1972 


1991 SEQ 


ID NO 


4121 aagtccataatggttcaga 


12819 


12838 


I 5 


SEQ ID NO 


2784 


actgtcatggacttcagaa 


1994


2013SEQ 


ID NO 


4122 ttctgaatatattgtcagt


13384 


13403 


1 5 


SEQ ID NO 


2785 


acttgacccagcctcagcc


2059 


2078 SEQ 


ID NO 


41 23ggctcaccctgagagaagt 


12399 


12418 


1 5 


SEQ ID NO 


2786 


tccaaataactaccttcct 


2104 


2123SEQ 


ID NO 


4 1 24 aggaagatatgaagatgga 


4720 


4739 


1 5 


SEQ ID NO. 


2787 


actaccctcactgcctttg


2141 


2160 SEQ


ID NO 


4125caaatttgtggagggtagt 


10327 


10346 


i 5 


SEQ ID NO 


2788 


ttggatttgcttcagctga


2157 


2176SEQ 


ID NO 


41 26tcagtataagtacaaccaa 


9400 


9419 


I 5 


SEQ ID NO 


2789 


ttggaagctctttttggga


2219 


2238SEQ 


ID NO 


4 1 27 tcccgattcacgcttccaa 


11585 


11604


I 5 


SEQ ID NO 


2790 


ggaagctctttttgggaag 


2221 


2240SEQ 


ID NO 


; 41 28 cttcagaaagctaccttcc 


7937 


7956 


1 5 


SEQ ID NO 


2791 


tttttcccagacagtgtca




2265SEQ 


ID NO 


4 1 29 tg accttctctaagcaaaa 


4864 


4903 


1 5 


SEQ ID NO 


2792 


agacagtgtcaacaaagct


2254 


2273SEQ 


ID NO 


4130 agcttggttttgccagtct


2466 


2485 


I 5 


SEQ ID NO 


2793 


ctttggctataccaaagat


2329 


2348SEQ 


ID NO 


4131 atctcgtgtctaggaaaag 


5976 


5995 


I 5 


SEQ ID NO 


2794 


caaagatgataaacatgag 


2341 


2360SEQ 


ID NO:


4132 ctcaaggataacgtgtttg


12617 


12636 


1 5 


SEQ ID NO 


2795 


gatatggtaaatggaataa




2382 SEQ 


ID NO 


41 33uatcttattaattatatG 


13087 


A ^ A # 

13106 


1 

1 5 


SEQ ID NO 


2796 


ggaataatgctcagtgttg 


2375 


2394SEQ 


ID NO 


41 34caacacttacttgaattcc 


10669 


10688 


I 5 


SEQ ID NO 


2797 


tttgaaatccaaagaagtc 


2410 


2429 SEQ 


ID NO 


4 1 35 gacttcagag aaatacaaa 


AAA 

11408 


A A An ^9 J 

11427 


I 5 


SEQ ID NO 


2798 


gatcccccagatgattgga


2542 


2561 SEQ 


ID NO 


4136 tccaatttccctgtggatc


3689 


3708 


1 5 


SEQ ID NO 


2799 


cagatgattggagaggtca 


2549 


O CCD » 

2568 SEQ 


ID NO 


41 37tgaccacacaaacagtctg 


5371 


5390 


1 5 


SEQ ID NO 


2800 


agaatgacttttttcttca


2583 


2602 SEQ 


ID NO 


A A ^1 ^\ ± — _ _ . _ — _ _ XJ. A.X A. 

41 38tgaagtccggattcattct 


11023


11042


1 5 


SEQ ID NO 


2801 


gaactccccactggagctg


2627 


2646SEQ 


ID NO 


J A ^ ^\ i_ A. XX 

; 4139cagctcaacGgtacagttc 


11869 


11888 


I 5 


SEQ ID NO 


2802 


atatcttcatctggagtca 


2660


2679SEQ 


ID NO 


41 40tgacttcagtgGagaatat 


A A A 

11974


11993 


I 5 


SEQ ID NO 


2803 


gtcattgctcccggagcca 


2675


2694SEQ 


ID NO 


4141 tggccccgtttaccatgac 


5817 


5836 


I 5 


SEQ ID NO 


2804 


gctgaagtttatcattcct 


2881 


2900 SEQ 


ID NO 


AAA P\ ^ x< A. ■ X 

4 1 42 aggaggctttaagttcagc 


7608 


7627 


I 5 


SEQ ID NO 


2805 


attccttccccaaagagac 


2894 


2913SEQ 


ID NO 


a ^ ^ a A i ■ a a a 

4 1 43 gtctcttcctccatggaat 


10478 


10497 " 


1 5 


SEQ ID NO 


2806 


ctcattgagaacaggcagt 


2984 


3003SEQ 


ID NO 


; 4 1 44 actgactgcacgctttgag 


11764 


11783 ' 


1 5 


SEQ ID NO 


2807 


ttgagcagtattctgtcag 


3150 


3169SEQ 


ID NO 


41 45 ctgagagaagtgtcttcaa 


12407 


12426 ' 


1 5 


SEQ ID NO 


2808 


accttgtccagtgaagtcc 


3293 


3312SEQ 


ID NO 


4 1 46 ggacggtactgtcccaggt 


12792 


12811 


1 5 


SEQ ID NO 


2809 


ccagtgaagtccaaattcc 


3300 


3319SEQ 


ID NO 


41 47 ggaaggcagagtttactgg 


9156 


9175 ' 


I 5 


SEQ ID NO 


2810 


acattcagaacaagaaaat 


3402 


3421 SEQ 


ID NO 


41 48 atttcctaaagctggatgt 


11175 


11194 ' 


I 5 


SEQ ID NO 


2811 


gaaaaatcaagggtgttat 


3471 


3490SEQ 


ID NO 


41 49 ataaactgcaagatttttc 


13608 


13627 • 


1 5 


SEQ ID NO 


2812 


aaatcaagggtgttatttc 


3474 


3493 SEQ 


ID NO 


. 4150gaaacaatgGattagattt 


9753 


9772 ' 


1 5 


SEQ ID NO 


: 2813 


tggcattatgatgaagaga 


3617 


3636SEQ 


ID NO 


; 4151 tctcccgtgtataatgcca 


11789 


11808 ' 


1 5 


SEQ ID NO 


2814 


aagagaagattgaatttga 


3630 


3649SEQ 


ID NO 


: 4152 tcaaaacctactgtctctt 


10466 


10485 ' 


I 5 


SEQ ID NO 


2815 


aaatgacttccaatttccc 


3681 


3700SEQ 


ID NO 


4 1 53 gggaactacaatttcattt 


7021 


7040 ' 


1 5 
SEQ ID NO


2816 


atgacttccaatttccctg 


3683 


3702 SEQ 


ID NO 


; 41 54 caggctgattacgagtcat 


4925 


4944 


\ 5 


SEQ ID NO 


; 2817 


acttccaatttccctgtgg


3686 


3706 SEQ 


ID NO 


4155 ccacgaaaaatatggaagt


10368 


10387 


1 5 


SEQ ID NO 


2818 


agttgcaatgagctcatgg 


3811 


3830SEQ 


ID NO 


; 4156ccatcagttcagataaact 


7997 


8016 


1 5 


SEQ ID NO 


: 2819 


tttgcaagaccacctcaat 


3868 


3887SEQ 


ID NO 


; 4157attgacctgtccattcaaa 


13679 


13698 


1 5 


SEQ ID NO 


2820 


gaaggagttcaacctccag 


3892 


3911 SEQ


ID NO 


; 4158 ctggaattgtcattccttc 


11736 


11755 • 


1 5 


SEQ ID NO 


: 2821 


acttccacatcccagaaaa 


3927 


3946 SEQ 


ID NO 


4159 ttttaacaaaagtggaagt 


6829 


6848 


1 5 


SEQ ID NO 


2822 


ctcttcttaaaaagcgatg 


3947 


3966 SEQ 


ID NO 


41 60 catcactgccaaaggagag 


8494 


8613 ' 


1 5 


SEQ ID NO 


: 2823 


aaaagcgatggccgggtca 


3956 


3976 SEQ 


ID NO 


4161 tgactcactcattgatttt 


12688 


12707 • 


1 5 


SEQ ID NO:


2824


ttcctttgccttttggtgg


4011 


4030 SEQ 


ID NO 


; 4162ccacaaacaatgaagggaa 


9264 


9283 


1 5 


SEQ ID NO:


2825


caagtctgtgggattccat


4087 


4106 SEQ


ID NO 


; 4163atgggaaaaaacaggcttg 


9574 


9593 


1 5 


SEQ ID NO 


2826 


aagtccctacttttaccat


4125 


4144 SEQ


ID NO 


4 1 64 atgggaagtataagaactt 


4842 


4861 


1 5 


SEQ ID NO 


2827 


tgcctctcctgggtgttct 


4167 


4186SEQ 


ID NO 


41 65 agaaaaacaaacacaggca 


9661 


9670 


1 5 


SEQ ID NO 


: 2828 


accagcacagaccatttca 


4250 


4269 SEQ 


ID NO 


; 4166tgaagtgtagtctcctggt 


5097 


5116 ' 


1 5 


SEQ ID NO 


2829 


ccagcacagaccatttcag 


4251 


4270 SEQ 


ID NO:


4167 ctgaaatacaatgctctgg


5519 




1 5 


SEQ ID NO 


• 2830 


actatcatgtgatgggtct 


4375 


4394 SEQ 


ID NO 


; 4168agacacctgattttatagt 


7956 


7975 ' 


1 5 


SEQ ID NO 


2831 


accacagatgtctgcttca 


4504 


4523 SEQ 


ID NO 


: 4169tgaaggctgactctgtggt 


4290 


4309 


I 5 


SEQ ID NO 


2832 


ccacagatgtctgcttcag 


4505 


4524 SEQ 


ID NO 


41 70 ctgagcaacaaatttgtgg 


10319 


1 0338 ■ 


1 5 


SEQ ID NO 


2833 


tttggactccaaaaagaaa 


4528 


4547 SEQ 


ID NO 


41 71 tttctctcatgattacaaa 


5941 


5960 ' 


1 5 


SEQ ID NO 


2834 


tcaaagaagtcaagattga 


4560 


4579 SEQ 


ID NO 


41 72tcaaggataacgtgtttga 


12618 


12637 ' 


1 5 


SEQ ID NO 


2835 


atgagaactacgagctgac 


4806 


4825SEQ 


ID NO 


4 1 73 gtcagatattgttgctcat 


10195 


10214 ' 


1 5 


SEQ ID NO 


2836


ttaaaatctgacaccaatg 


4826 


4845 SEQ 


ID NO 


4 1 74 cattcattgaagatgttaa 


7350 


7369 ' 


1 5 


SEQ ID NO 


2837 


gaagtataagaactttgcc 


4846 


4865 SEQ 


ID NO 


41 75ggcaaatttgaaggacttc 


12002 


12021 ' 


1 5 


SEQ ID NO 


2838 


aagtataagaactttgcca 


4847 


4866 SEQ 


ID NO 


41 76tggcaaatttgaaggactt 


12001 


12020 ' 


1 5 


SEQ ID NO 


2839 


ttcttcagcctgctttctg 


4949 


4968 SEQ 


ID NO 


41 77cagaatccagatacaagaa 


6892 


3911 ' 


I 5 


SEQ ID NO 


2840 


ctggatcactaaattccca 


4965 


4984 SEQ 


ID NO 


41 78tgggtctttccagagccag 


11041 


11060 ' 


1 5 


SEQ ID NO 


2841


aaattaatagtggtgctca 


5022 


5041 SEQ 


ID NO 


41 79 tgagaagccccaagaattt 


6256 


5275 ' 


1 5 


SEQ ID NO 


2842 


agtgcaacgaccaacttga 


5081


5100 SEQ


ID NO 


41 SOtcaaattcctggatacact 


9856 


9875 - 


1 5 


SEQ ID NO 


2843 


ctgggaagtgcttatcagg 


5246 


5265 SEQ 


ID NO 


4181 cctgaccttcacataccag 


8318 


8337 ' 


I 5 


SEQ ID NO 


2844 


gcaaaaacattttcaactt 


5286 


5305 SEQ 


ID NO 


41 82 aagtaaaagaaaattttgc 


10752 


10771 - 


I 5 


SEQ ID NO. 


2845 


aaaaacattttcaacttca 


5288 


5307SEQ 


ID NO 


41 83tgaagtaaaagaaaatttt 


10760 


1 0769 ' 


I 5 


SEQ ID NO 


2846 


tcagtcaagaaggacttaa 


5310 


5329 SEQ 


ID NO 


41 84 ttaaggacttccattctga 


13371 


1 3390 - 


1 5 


SEQ ID NO 


2847 


tcaaatgacatgatgggct 


5333 


5352SEQ 


ID NO 


41 85 agcccatcaatatcattga 


6213 


5232 ' 


1 5 


SEQ ID NO: 


2848


cacacaaacagtctgaaca 


5375 


5394 SEQ 


ID NO 


41 86 tgtttcaactgcctttgtg 


11227 


1 1 246 ' 


i 5 


SEQ ID NO: 


2849 


tcttcaaaacttgacaaca 


5417 


5436 SEQ 


ID NO 


41 87tgttttcctatttccaaga 


12843 


12862 ' 


1 5 


SEQ ID NO: 


2850 


caagttttataagcaaact 


5449 


6468 SEQ 


ID NO 


41 88 agttattttgctaaacttg 


14051 


14070 - 


1 5 


SEQ ID NO: 


2851 


tggtaactactttaaacag 


5496 


5515SEQ 


ID NO 


41 89 ctgtltttagaggaaacca 


7520 


7539 - 


! 5 


SEQ ID NO: 


2852 


aacagtgacctgaaataca 


5510 


5529 SEQ 


ID NO- 


41 90tgtatagcaaattcctgtt 


5898 


5917 1 


1 5 


SEQ ID NO: 


2853


gggaaactacggctagaac 


5552 


5571 SEQ 


ID NO: 


4191 gttccttccatgatttccc 


10941 


1 0960 1 


1 5 


SEQ ID NO: 


2854


aacacatctatgccatctc 


5628 


5647 SEQ 


ID NO: 


41 92 gagacagcatcttogtgtt 


11212 


11231 1 


1 5 


SEQ ID NO:


2855 


tcagcaagctataaagcag 


5660 


5679 SEQ 


ID NO: 


41 93 ctgctaagaaccttactga 


7788 


7807 1 


1 5 


SEQ ID NO:


2856 


gcagacactgttgctaagg 


5675 


5694 SEQ 


ID NO: 


4194 cctttcaagcactgactgc


11754 


11773 1 


1 5 


SEQ ID NO: 


2857 


tctggggagaacatacigg 


5874 


5893 SEQ 


ID NO: 


41 95 ccaggttttccacaccaga 


8046 


8065 1 


1 5 


SEQ ID NO: 


2858 


ttctctcatgattacaaag 


5942 


5961 SEQ 


ID NO:


4196 ctttttcaccaacggagaa


10846 


1 0865 1 


5 


SEQ ID NO: 


2859 


ctgagcagacaggcacctg 


6042 


6061 SEQ 


ID NO: 


41 97caggaggctttaagttcag 


7607 


7626 1 


5 


SEQ ID NO: 


2860 


caatttaacaacaatgaat 


6074 


6093 SEQ 


ID NO: 


41 98 attccttcctttacaattg 


8090 


8109 1


6 


SEQ ID NO: 


2861 


tggacgaactctggctgac 


6148 


6167SEQ 


ID NO: 


4199 gtcagcccagttccttcca


10932 


1 0951 1 


5 


SEQ ID NO: 


2862 


cttttactcagtgagccca 


6200 


6219 SEQ


ID NO: 


4200 tgggctaaacgtatgaaag 


7835 


7854 1 


1 5 


SEQ ID NO: 


2863 


tcattgatgctttagagat 


6225 


6244 SEQ 


ID NO: 


4201 atcttcataagttcaatga 


13182 


13201 1 


1 5 


SEQ ID NO:


2864 


aaaaccaagatgttcactc 


6303 


6322 SEQ 


ID NO: 


4202 gagtgaaatgctgtttttt 


86o8 


3657 ' 


5 
SEQ


ID NO:




aggaatcgacaaaccatta 


6365 


6384 QPo 


in MO- 

lU INU. 


4203taataattttcaaattcct 


8302 


8321 


1 5 




in NO- 
IL' iNW, 


^ooo 


taqttqtactQQaaaacat 


6384 


6403 <5Po 


in MO' 
lU NU. 


4204 acattaGcctctaaaacta 


1 1936 

1 1 W W \J 


11955 

1 1 www 


1 5 




in NO- 
IL^ INWa 




ggaaaacqtacagagaaag 


6394 


6413Qpn 


in t\\r\' 

lU NU. 


4205 cttttacaattcatlttcG 


13022 


13041 


1 5 




lU inu; 




gaaaacgtacagagaaagc 


6395 


6414Qcr. 


ID NO: 


4206 gctttctcttccacatttc 


10060 


10079 


1 5 


otU 


lU IMU! 


2869 




8409 


fi42fl c?cr-\ 


ID NO 




AQQ9 


1 U 1 1 


1 ^ 




irv MO. 
\U NU. 


Zo7\J 


aanntnssocanateaata 


6410 


R42QC5CO 


ID NO 




RQQI 

1 


701 0 




ctzr\ 


irk MO< 




^ wC3 wl Ci&ILt^C^ 


8414 




ID NO 


49nQtraarH1a3tn:5ttttra 




R'^-14 




oLU 


Ir^ MO- 


2o/2 


atnriptr^tto^tcpir^lii'ci 


S422 

tip''™ ^ ^ 


<t441 c^i=f-\ 


ID NO: 






1 00 / 


I ^ 




iri Mp\ 
lU IMLJ 




taatoattatctoaattca 


8484 




ID NO; 


491 1 tnaastratinaanaatta 




R74R ' 


I ^ 


obU 


lU iMU 


2874 


natfatrH'naattnaftpa 


8<^.88 


8*507 c*cro 


ID NO: 


9fnsisnfsinf^{'n£insi9£i9t(^ 


71 09 


7i9l 


1 

1 w 


otU 


ID NU 


2875 


aa'ttnn n a n ?3 ciacaf^nttt 


6506 




ID NO 


1 dClOll./ClLlLA^lllddwdClll 


Q4C)ft 


93 1 9 


1 

1 


o tvj 


in MO 


-do /o 


aaaataa ctattactaaf a 


5701 


6720 onro 


ID NO 


42 1 4 tatta 99 a atatta atf It 


6814 






obU 


ID NO 


2877 


a S| a saf fa a a a ssntpHnat 






ID NO 




R7Rfi 
u 1 






obQ 


I r\ MO 

ID NU 


2878 


ttaaaaatattn a tttla a 


S81S 




ID NO: 


H-<^ 1 LLsle!lL>lLLro Lady LllrfdCl 


1 0 1 f t? 


1 0 1 t?o 


1 0 






/» 


aa acatccaaca cctaa ct 


6946 




ID NO 








1 ^ 




lU NU 


2880 




7029 




ID NO 


49 1 R attnrttrrftiar^aattn 






1 3 




ir\ MO 
lU NU 


2oo 1 


a aatttta ata a ataaatt 




7201 c?i=o 
' ^'J I SEQ 


ID NO 


49 "1 Q aatlfitfn^iaanciaciai^fH" 


1*^1^R 


1 0 1 ^ *r 


1 U 


obU 


ir\ MO 
ID NU 


2oo2 




7241 




ID NO 


499n nnar'aannr^rr'anaatr'fn 


1 ^'□UO 


19^79 




o t o 


lU NU 




taaaataaaaaatfacttt 


7270 




ID NO 


4991 ?4?i?in3335ir'fH'atnr'r*f ta 
1 ciciciyciaciou^v.>iciLLjL>oiid 


\ 0 1 ^^o 


1'^-1R9 
1 0 1 0^ 




obLi 


ir\ Kio 
lU NU 


2884 


aaaa attactttaaoaaat 


7277 




ID NO 


49 9 9 atttrttfl n annttrrttt 

*T£.^£. dLLL^ilClddV/dlLL/LiLLl 


Q4ftQ 






obU 


IP NU 


2885 


aaa aa attaattfi aattt'a 


7289 




ID NO 




1 9Q7n 
1^9/ U 


i9QftQ 


1 


obU 


lU NU 


288c 


atttattaataatoctatc 


7303 


7322 eco 


ID NO 


4 9 94 r^al'ntl'n 9 ta a a n a a a t 
*T^^*T yclwcll^ LL^cllcieldyddcl I 










IPk KIO 

lU NU 


2oo/ 


0 aattatcttttaaaacat 


7334 




ID NO 


499^ atn+atf*sis*9trT na r^nttf* 

^£>£>«J LCllwClClCll.vJ y ClU>dlLw 




7704 


1 *^ 


obU 


ir\ KIO 
lU NU 




ttancaccaatttatanaf 


7411 




ID NO 






107'=;r 


1 

1 0 


obU 


lr\ MO 
lU NU 


2oGs:i 


ttacaatatatcta a aaaa 


7548 


7567ctro 


ID NO 


4 9 9 7 f^ttttna tta n a tn pa a 


8490 




1 0 


obU 


in MO 
lU NU 


2890


cattcaacaaa aacttcaa 


7699 


771 Sccro 


ir^ K 10 
ID NU 


4 9 7 R ttfi aannacttpa n fi a a t n 


1 90nQ 


19n9R 


1 <S 
1 0 


obLi 


IT\ MO 
lU NU 


20^1 


acacctaattttataatcc 


7958 


7977GCO 


10 KIO 

ID NU 


499 9 aaantoaa ooatfl aootflt 


19fi14 

1 1 *T 


1 9(533 


1 R 
1 0 


obvtc 


in MPi 

lU INU 


OOOO 


□aattccatcaattcaaat 


7992 


801 1 ci^o 


\u NU 


42 30 atcttca ataattatatcc 


13124 
1^1 


13143 

1^1 *Tv 


1 5 

1 w 


ceo 


lU NU 




ttataaaaataaaaataaa 


B112 


8131 ceo 


in N 

lU NU 


4931 tttatoaHatntca ana a 




1237Q 

I^V r x7 


1 5 


obLj 


in MO 
lU NU 


2894


ctaaacaataaactacaat 


8156 


8175oi=o 


m ^ 10 
ID NU 


4232actaaacttctctaatcaa 


8309 


8828 






in MO 




aatccaatctcctcttttc 


8407 


842Scco 


m MO 
lU NU 


4233Qaaaaataaaatccaaatt 


1 1 017 


1 1 03S 


1 5 


ObU 


in MO 
\U NU 


2aaO 


atttta attttcaacicaaa 


8532 


8551 cno 


ID NU 


4234tttaeaaattaaaaaaaat 


14023 


14042 


1 S 


obU 


in MO 

lU INU 




tttta attttcaaa caaat 


8533 


8552cirrk 


ID NO 


4235attf'aafttaaatataaaa 

~b wv/ dk iLy d iLiddy Ly Lcicicici 


9622 


9641 


1 '5 


ObU 


in MPi 

lU INU 


OQOQ 


taattttcaaacaaataca 


8536 


8555c; rro 


lU NU 


4236 ta caaattaaaaaaaatea 

nr^MWw hMwddy LLddctyddddLwd 


14025 


14044 


1 \J 


ceo 


1 n MO 
lU NU 


oono 
zoyy 


ata eta tttttt a a a a ata 


8645 


8664 0 CO 


ID NO 


4937r'attnnfannan3p?>nr*at 


1 190"^ 
1 1 ^uo 


1 1 999 


1 

1 O 


obU 


in KIO 
lU NU 


2900


t a eta tttttta a a a atac 


8646 




ID NO 


49 3R npattnntsinnsinapanpa 
*T^tJu ^v,/dii^^l,d^M d^d^^ctui^d 


1 1 9D9 
1 1 


11991 

1 1 1 


1 0 


ceo 


m KIO 
ID NU 


2901


aaaaaaatacacta naacf 


8706 




ID NU 


49 Q an nf a c\^nc\n ppfpttttt 


108'^'^ 


10R'=i9 


1 0 


ObU 


m MO 
lU NU 


2902


a eta □ aa cttaata ata a a 


8716 




ID NO 


494n tcpapfpsipfttppf ^v«rif 


1 9ftQ 

1 &>(J0 


1 30R 


1 ^ 


obU 


in MO 
lU NU 


2903


cttctaaaaaaaaatcata 

Vpfiiwiyy ddddyyy iwdiy 


8886 


8Qn*Sor=o 


ID NO 


1 v.^iyddvi./LiUriidwdiyddy 


1 '^7'^Q 
1 0 r ^JS7 


1 377R 


1 


SEQ 


ID NO 


2904 


ggaaaagggtcatggaaat 


8891 


8910SEQ 


ID NO 


4242 atttgaaagitcgttttcc 


9282 


9301 ' 


1 5 


SEQ 


ID NO 


2905 


gggcctgccccagattctc 


8910 


8929SEQ 


ID NO 


4243gagaacattatggaggccc 


9440 


9459 


1 5 


SEQ 


ID NO 


2906 


ttctcagatgagggaacac 


8924


8943SEQ 


ID NO 


4244gtgtcttcaaagctgagaa 


12416 


12435 ' 


1 5 


SEQ 


ID NO 


2907 


gatgagggaacacatgaat 


8930 


8949SEQ 


ID NO 


4245 attccagcttccccacatc 


8338 


8357 ' 


1 6 


SEQ 


ID NO 


2908 


ctttggactgtccaataag 


8986 


9005SEQ 


ID NO 


4246 cttatgggatttcctaaag 


11167 


11186 ' 


1 5 


SEQ 


ID NO 


2909 


gcatccacaaacaatgaag 


9260 


9279SEQ 


ID NO 


4247 cttcatctgtcattgatgc 


10227 


10246 


1 5 


SEQ 


ID NO 


2910 


cacaaacaatgaagggaat 


9265 


9284SEQ 


ID NO 


4248 attccctgaagttgatgtg


11488 


11507


1 6 


SEQ 


ID NO 


2911


ccaaaatttctctgctgga


9415 


9434SEQ 


ID NO 


4249 tccatcacaaatcctttgg


9671 


9690


1 5 


SEQ 


ID NO 


2912


caaaatttctctgctggaa


9416 


9435SEQ 


ID NO 


4250 ttccatcacaaatcctttg


9670 


9689 


1 5 


SEQ 


ID NO 


2913


tctgctggaaacaacgaga


9425 


9444SEQ 


ID NO 


4251 tctcaagagttacagcaga 


13229 


13248 ' 


1 5 
SEQ ID NO


2914 


ctgctggaaacaacgagaa 


9426 


9445 SEQ 


ID NO 


4252 ttctcaagagttacagcag 


13228 


13247 


1 


5 


SEQ ID NO 


2915 


agaacattatggaggccca 


9441 


9460 SEQ 


ID NO 


4253 tgggcctgccccagattct 


8909 


8928 


1 


5 


SEQ ID NO 


2916 


agaagcaaatctggatttc


9475 


9494 SEQ 


ID NO 


4254 gaaatcttcaatttattct


13821 


13840 


1 


5 


SEQ ID NO 


2917 


tttctctctatgggaaaaa 


9565 


9584SEQ 


ID NO 


4255tttttgcaagttaaagaaa 


14021 


14040 


1 


5 


SEQ ID NO 


2918 


tcagagcatcaaatccttt 


9712 


9731 SEQ 


ID NO 


4256 aaagaaaatcaggatctga


14033 


14052 


1 


5 


SEQ ID NO 


2919 


cagaaacaatgcattagat 


9751 


9770SEQ 


ID NO 


4257 atctaigccatctcttctg 


5633 


5652 


1 


5 


SEQ ID NO 


2920 


tacacattaatcctgccat 


10001 10020SEQ 


ID NO 


4258 atggagtctttattgtgta 


14089 


14108 


1 


5 


SEQ ID NO 


2921 


agtcagatattgttgctca 


10194 10213SEQ 


ID NO 


4259tgagaactacgagctgact 


4807 


4826 


1 


5 


SEQ ID NO 


2922 


ggagggtagtcataacagt 


10336 10355 


ID NO 


4260 actggtggcaaaaccctcc


2734 


2753 


1 


5 


SEQ ID NO 


2923 


caaaagccgaaattccaat


10404 10423 SEQ 


ID NO 


4261 attgaagtacctacttttg 


8366 


8385 


1 


5 


SEQ ID NO 


2924 


aaaagccgaaattccaatt 


10405 10424 SEQ 


ID NO 


4262 aattgaagtacctactttt


8365 


8384 


1 


5 


SEQ ID NO 


2925 


ttcaagcaagaacttaatg 


10436 10455 SEQ 


ID NO 


4263 cattatggcccttcgtgaa 


13258 


13277 


1 


5 


SEQ ID NO 


2926


cctcttacttttccattga 


10578 10597 SEQ 


ID NO 


4264 tcaaaagaagcccaagagg


12947 


12966 


1 


5 


SEQ ID NO 


2927 


tgaggccaacacttacttg 


10663 10682 SEQ 


ID NO 


4265 caagcatctgattgactca 


12676 


12695 


1 


5 


SEQ ID NO 


2928 


cacttacttgaattccaag 


10672 10691 SEQ 


ID NO 


4266cttgaacacaaagtcagtg 


6008 


6027 


1 


5 


SEQ ID NO 


2929 


gaagtaaaagaaaattttg 


10751 10770SEQ 


ID NO 


4267caaaaacattttcaacttc 


5287 


5306 


1 


5 


SEQ ID NO 


2930


cctggaactctctccatgg 


10882 10901 SEQ 


ID NO 


4268 ccatttacagatcttcagg


11372 


11391 


1 


5 


SEQ ID NO 


2931 


agctggatgtaaccaccag 


11184 11203 SEQ


ID NO 


4269 ctggattccacatgcagct 


11855 


11874 


1 


5 


SEQ ID NO


2932 


aaaattccctgaagttgat


11485 11504 SEQ


ID NO 


4270 atcatatccgtgtaatttt 


6765 


6784 


1 


5 


SEQ ID NO 


2933 


cagatggcattgctgcttt


11613 11632 SEQ


ID NO:


4271 aaagctgagaagaaatctg 


12424 


12443 


1 


5 


SEQ ID NO 


2934 


agatggcattgctgctttg 


11614 11633SEQ 


ID NO 


4272 caaagctgagaagaaatct 


12423 


12442 


1 


5 


SEQ ID NO 


2935 


tgttgaaacagtcctggat


11842 11861 SEQ


ID NO 


4273atccaagatgagatcaaca 


13103 


13122 


1 


5 


SEQ ID NO 


2936 


catattcaaaactgagttg 


12229 12248 SEQ 


ID NO. 


4274 caactctctgattactatg 


13631 


13650 


1 


5 


SEQ ID NO 


2937 


aaagatttatcaaaagaag


12938 12957SEQ 


ID NO. 


4275 cttcaatttattcttcttt 


13826 


13845 


1 


5 


SEQ ID NO 


2938


attttccaactaatagaag 


13034 13053 SEQ 


ID NO. 


4276cttcaaagacttaaaaaat 


8014 


8033 


1 


5 


SEQ ID NO 


2939 


aattatatccaagatgaga


13097 13116SEQ 


ID NO:


4277 tctcttcctccatggaatt


10479 


10498 


1 


5 


SEQ ID NO 


2940 


ttcaggaagcttctcaaga 


13218 13237 SEQ 


ID NO 


4278tcttcataagttcaatgaa 


13183 


13202 


1 


5 


SEQ ID NO 


2941 


ttgagcaatttctgcacag 


13437 13456 SEQ 


ID NO: 


4279 ctgttgaaagatttatcaa 


12932 


12951 


1 


5 


SEQ ID NO 


2942 


ctgatatacatcacggagt 


13712 13731 SEQ 


ID NO: 


4280 actcaatggtgaaattcag 


7465 


7484 


1 


5 


SEQ ID NO: 


2943 


acatcacggagttactgaa 


13719 13738 SEQ 


ID NO: 


4281 ttcagaagctaagcaatgt 


7239 


7258 


1 


5 


SEQ ID NO: 


2944 


actgcctatattgataaaa 


13882 13901 SEQ 


ID NO: 


4282ttttggcaagctatacagt 


8380 


8399 


1 


5 


SEQ ID NO: 


2945 


aggatggcattttttgcaa


14011 14030 SEQ 


ID NO: 


4283ttgcaagcaagtctttcct 


3013 


3032 


1 


5 


SEQ ID NO 


2946 


ttttttgcaagttaaagaa 


14020 14039SEQ 


ID NO:


4284ttctctctatgggaaaaaa 


9566 


9585 


1 


5 


SEQ ID NO 


2947 


tccagaactcaagtcttca 


1627 


1646 SEQ 


ID NO: 


4285tgaaatgctgttttttgga 


8641 


8660 


3 


4 


SEQ ID NO: 


2948 


agttagtgaaagaagttct 


1956 


1975 SEQ 


ID NO: 


4286agaatctgtaccaggaact 


12564 


12583 


3 


4 


SEQ ID NO: 


2949 


atttacagctctgacaagt 


5435 


5454 SEQ 


ID NO: 


4287 acttcagagaaatacaaat 


11409 


11428 


3 


4 


SEQ ID NO: 


2950 


gattatctgaattcattca 


6488 


6507 SEQ 


ID NO: 


4288 tgaaaccaatgacaaaatc 


7429 


7448 


3 


4 


SEQ ID NO: 


2951 


gtgcccttctcggttgctg 


26 


45 SEQ 


ID NO: 


4289cagctgagcagacaggcac 


6039 


6058 


2 


4 


SEQ ID NO 


2952 


attcaagcacctccggaag 


253 


272 SEQ 


ID NO:


4290 cttcataagttcaatgaat 


13184 


13203 


2 


4 


SEQ ID NO:


2953 


gactgctgattcaagaagt 


316 


335 SEQ 


ID NO: 


4291 acttcccaactctcaagtc 


13415 


13434 


2 


4 


SEQ ID NO: 


2954 


ttgctgcagccatgtccag


483 


502 SEQ 


ID NO: 


4292 ctgggcagctgtatagcaa 


5889 


5908 


2 


4 


SEQ ID NO: 


2955 


agaaagatgaacctactta 


555


574 SEQ 


ID NO: 


4293taagtatgatttcaattct 


10498 


10517 


2 


4 


SEQ ID NO: 


2956 


tgaagactctccaggaact


1095 


1114SEQ 


ID NO: 


4294 agttcaatgaatttattca 


13191 


13210 


2 


4 


SEQ ID NO: 


2957 


atctctcttgccacagctg 


1210 


1229 SEQ 


ID NO: 


4295 cagcccagccatttgagat 


9237 


9256 


2 


4 


SEQ ID NO: 


2958


tctctcttgccacagctga


1211 


1230SEQ 


ID NO: 


4296 tcagcccagccatttgaga 


9236 


9255 


2 


4 


SEQ ID NO: 


2959 


tgaggtgtccagccccatc 


1231 


1250SEQ 


ID NO:


4297 gatgggaaagccgccctca


5216 


5235 


2 


4 


SEQ ID NO: 


2960 


ccagaactcaagtcttcaa 


1628 


1647 SEQ 


ID NO: 


4298 ttgaaagcagaacctctgg 


5915 


5934 


2 


4 


SEQ ID NO: 


2961 


ctgaaaaagttagtgaaag 


1949 


1968 SEQ 


ID NO: 


4299 ctttctcgggaatattcag 


10631 


10650 


2 


4 


SEQ ID NO: 


2962 


tttttcccagacagtgtca 


2246 


2265 SEQ 


ID NO: 


4300 tgacaggcattttgaaaaa 


9730 


9749 


2 


4 


SEQ ID NO: 


2963 


ttttcccagacagtgtcaa 


2247 


2266 SEQ 


ID NO: 


4301 ttgacaggcattttgaaaa 


9729 


9748 


2 


4 
SEQ ID NO:


2964 


cattcagaacaagaaaatt 


3403 


3422SEQ ID NO: 


4302 aattccaattttgagaatg 


10414 


10433 


2 


4 


SEQ ID NO: 


2965 


tgaagagaagattgaattt 


3628 


3647SEQ ID NO: 


4303 aaatgtcagctcttgttca 


10902 


10921 


2 


4 


SEQ ID NO: 


2966 


tttgaatggaacacaggca 


3644 


3663SEQ ID NO: 


4304tgccagtttgaaaaacaaa 


11815 


11834 


2 


4 


SEQ ID NO: 


2967 


ttctagattcgaatatcaa


4407 


4426 SEQ 


ID NO: 


4305ttgacatgttgataaagaa 


7377 


7396 


2 


4 


SEQ ID NO. 


2968 


gattcgaatatcaaattca 


4412 


4431 SEQ 


ID NO: 


4306tgaagtagaccaacaaatc 


7162 


7181 


2 


4 


SEQ ID NO 


2969 


tgcaacgaccaacttgaag 


5083


5102SEQ 


ID NO: 


4307cttcaggttccatcgtgca 


11384 


11403 


2 


4 


SEQ ID NO; 


2970 


ttaagctctcaaatgacat 




5344 SEQ 


ID NO: 


4308 atgttgataaagaaattaa 


7382 


7401 


2 


4 


SEQ ID NO: 


2971 




6074 


6093 SEQ 


ID NO: 


^ou^ancaaacigcciaTang 


loo rO 


1 OOeJO 


z 




SEQ ID NO: 


2972 


tgaatacagccaggacttg 


6088 


6107SEQ 


ID NO: 


4310 caagagcacacggtcttca


10687 


10706 


2 


4 


SEQ ID NO 


2973 


catcaatattgatcaattt 


6421 


6440 SEQ 


ID NO: 


4311 aaattccctgaagttgatg


11486 


11505 


2 


4 


SEQ ID NO, 


2974 


ttgagcatgtcaaacactt 


7059 


7078 SEQ 


ID NO: 


4312 aagtaagtgctaggttcaa


9381 


9400 


2 


4 


SEQ ID NO: 


2975 


tgaaggagactattcagaa 


7227 


7246 SEQ 


ID NO: 


4313 ttctgcacagaaatattca


13446 


13465 


2 


4 


SEQ ID NO: 


2976 


ttcaggctcttcagaaagc 


7929 


7948 SEQ 


ID NO: 


4314 gcttgctaacctctctgaa


12312 


12331 


2 


4 


SEQ ID NO: 


2977 


tccacaaattgaacatccc


8787 


6806SEQ 


ID NO: 


4315 gggacctaccaagagtgga


12533 


12552 


2 


4 


SEQ ID NO: 


2978 


tgaataccaatgctgaact


10167 10186 SEQ


ID NO: 


4316 agttcaatgaatttattca


13191 


13210 


2 


4 


SEQ ID NO. 


2979 


taaactaatagatgtaatc


12898 12917 SEQ


ID NO: 


4317 gattactatgaaaaattta


13640 


13659 


2 


4 


SEQ ID NO: 


2980 


ttgacctgtccattcaaaa 


13680 13699 SEQ 


ID NO: 


4318 ttttaaaagaaatcttcaa


13813 


13832 


2 


4 


SEQ ID NO' 


2981 


gggctgagtgcccttctcg 


19 


^'SEQ 


ID NO: 


4319 cgaggccaggccgcagccc


84 


103 


1 


4 


SEQ ID NO. 


2982


ggctgagtgcccttctcgg 


20 


SEQ 


ID NO: 


4320 ccgaggccaggccgcagcc


83 


102 




4 


SEQ ID NO" 


2983 




22 


^'SEQ 


ID NO; 


4oiii aaccgigcctgaatctcag 


1 1 oo7 


1 lo/o 




>* 
4 


SEQ ID NO: 


2984






^'^SEQ 


ID NO 


**oc.^ icagcigaccicaicgaga 


£. loo 






4 


SEQ ID NO: 


2985 


caggccgcagcccaggagc 


QO 


^^^SEQ 


ID NO 


40^0 gcicigcagcttcatccig 


37o 






4 


SEQ ID NO 


2986 


gctggcgctgcctgcgctg 


151 


170SEQ 


ID NO 


4324cagcacagaccatttcagc 


4252 


4271 




4 


SEQ ID NO 


2987 


J_ . JL 1. 

tgctgctggcgggcgccag 


177 


196SEQ 


ID NO. 


4325ctggatgtaaccaccagca 


11186 


11205 




4 


SEQ ID NO. 


2988 


ctggtctgtccaaaagatg 


227 


246 SEQ 


ID NO; 


4326 catcctgaagacx^agccag 


388 


407 




4 


SEQ ID NO- 


2989 


ctgagagttccagtggagt 


291 


310 SEQ 


ID NO- 


4327 actcaccctggacattcag


3391 


3410 


1 


4 


SEQ ID NO 


2990 


tccagtggagtccctggga 


299 


318 SEQ


ID NO- 


4328tcccggagccaaggctgga 


2683 


2702 


1 


4 


SEQ ID NO 


2991 


aggttgagctggaggttcc 


354 


373 SEQ 


ID NO:


4329 ggaaccctctccctcacd 


4736 


4755 




4 


SEQ ID NO 


2992 


tgagctggaggttccccag


358 


377 SEQ 


ID NO 


4330 ctgggaggcatgatgctca 


9171 


9190 


1 


4 


SEQ ID NO 


2993 


tctgcagcttcatcctgaa


378 


397 SEQ 


ID NO. 


4331 ttcaaatataatcggcaga 


3269 


3288 


1 


4 


SEQ ID NO. 


2994 


gccagtgcaccctgaaaga 


402 


421 SEQ 


ID NO' 


4332tcttccgttctgtaatggc 


5802 


5821 


1 


4 


SEQ ID NO 


2995 


ctctgaggagtttgctgca 


472 


491 SEQ 


ID NO' 


4333tgcaagaatattttgagag 


6348 


6367 


1 


4 


SEQ ID NO: 


2996 


aggtatgagctcaagctgg 


500 


519 SEQ


ID NO: 


4334 ccagtttccggggaaacct


12724 


12743 


1 


4 


SEQ ID NO 


2997 


tcctttacccggagaaaga


543 


562 SEQ 


ID NO: 


4335tctttttgggaagcaagga 


2227 


2246 


1 


4 


SEQ ID NO: 


2998 


catcaagaggggcatcatt 


583 


602SEQ 


ID NO. 


4336 aatggtcaagttcctgatg 


2285 


2304 


1 


4 


SEQ ID NO. 


2999 


tcctggttcccccagagac 


609 


628 SEQ 


ID NO 


4337 gtctctgaactcagaagga


13996 


14015 


1 


4 


SEQ ID NO: 


3000 


aagaagccaagcaagtgtt 


630 


649 SEQ


ID NO 


4338 aacaaataaatggagtctt 


14080 


14099 


1 


4 


SEQ ID NO; 


3001 


aagcaagtgttgtttctgg 


638 


657SEQ 


ID NO 


4339 ccagagccaggtcgagctt 


11050 


11069 




4 


SEQ ID NO: 


3002 


tctggataccgtgtatgga 


652 


671 SEQ 


ID NO 


4340 tccatgtcccatttacaga 


11364 


11383 




4 


SEQ ID NO 


3003


ccactcactttaccgtcaa 


678 


697SEQ 


ID NO 


4341 ttgattttaacaaaagtgg 


6825 


6844 




4 


SEQ ID NO: 


3004 


aggaagggcaatgtggcaa 


701 


720SEQ 


ID NO 


4342ttgcaagcaagtctttcct 


3013 


3032 




4 


SEQ ID NO. 


3005 


gcaatgtggcaacagaaat 


708 


727SEQ 


ID NO 


4343 atttccataccccgtttgc


3488 


3507 




4 


SEQ ID NO; 


3006 


caatgtggcaacagaaata 


709 


728 SEQ 


ID NO 


4344tattcttcttttccaattg 


13834 


13853 




4 


SEQ ID NO 


3007 


tggcaacagaaatatccac 


714 


733SEQ 


ID NO 


4345 gtggcttcccatattgcca


1895 


1914 




4 


SEQ ID NO; 


3008 


agagacctgggccagtgtg 


737 


756 SEQ


ID NO 


4346 cacattacatttggtctct 


2938 


2957 




4 


SEQ ID NO 


3009 


tgtgatcgcttcaagccca


752 


771 SEQ 


ID NO 


4349 tgggaaagccgccctcaca


5218 


5237 




4 


SEQ ID NO 


3010


gtgatcgcttcaagcccat 


753 


772 SEQ 


ID NO 


4350 atgggaaagccgccctcac


5217 


5236 




4 


SEQ ID NO 


3011


cagcccacttgctctcatc 


784 


803 SEQ 


ID NO 


4351 gatgctgaacagtgagctg


8152 


8171 




4 


SEQ ID NO 


3012 


gctctcatcaaaggcatge 


794 


813 SEQ 


ID NO 


4352 tcataacagtactgtgagc


10345 


10364 




4 
SEQ ID NO:

SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3013 

3014 

3015 

3016 

3017 

3018 

3019 

3020 

3021 

3022 

3023 

3024 

3025 

3026 

3027 

3028 

3029 

3030 

3031 

3032 

3033 

3034 

3035 

3036 

3037 

3038 

3039 

3040 

3041 

3042 

3043 

3044 

3045 

3046 

3047 

3048 

3049 

3050 

3051 

3052 

3053 

3054 

3055 

3056 

3057 

3058

3059 

3060 

3061 

3062 



ccttgtcaactctgatcag 

cttgtcaactctgatcagc 

agccatctgcaaggagcaa 

gccatctgcaaggagcaac 

cttcctgcctttctcctac 

ctttctcctacaagaataa

gatcaacagccgcttcttt 

atcaacagccgcttctttg 

acagccgcttctttggtga 

aagatgggcctcgcatttg 

tgttttgaagactctccag 

ttgaagactctccaggaac 

aactgaaaaaactaaccat 

ctgaaaaaactaaccatct 

aaaactaaccatctctgag

tgagcaaaatatccagaga 

caataagctggttactgag 

tactgagctgagaggcctc 

gcctcagtgatgaagcagt 

agtcacatctctcttgcca 

atctctcttgccacagctg 

tctctcttgccacagctga

tgccacagctgattgaggt 

gccacagctgattgaggtg

tcactttacaagccttggt 

cccttctgatagatgtggt 

gtcacctacctggtggccc

ccttgtatgcgctgagcca 

gacaaaccctacagggacc 

tgctaattacctgatggaa 

tgactgcactggggatgaa 

actgcactggggatgaaga 

atgaagattacacctattt

accatggagcagttaactc

gcagttaactccagaactc 

cagaactcaagtcttcaat 

caggctctgcggaaaatgg 

ccaggaggttcttcttcag

ggttcttcttcagactttc

tttccttgatgatgcttct 

ggagataagcgactggctg 

gctgcctatcttatgttga 

actttgtggcttcccatat

gccaatatcttgaactcag 

aatatcttgaactcagaag 

ctcagaagaattggatatc 

aagaattggatatccaaga 

agaattggatatccaagat 

tggatatccaagatctgaa 

atatccaagatctgaaaaa 



819 838 SEQ ID NO:
820 839 SEQ ID NO:
892 911 SEQ ID NO:
893 912 SEQ ID NO:
916 935 SEQ ID NO:
924 943 SEQ ID NO:
997 1016 SEQ ID NO:
998 1017 SEQ ID NO:
1002 1021 SEQ ID NO:
1031 1050 SEQ ID NO:
1090 1109 SEQ ID NO:
1094 1113 SEQ ID NO:
1110 1129 SEQ ID NO:
1112 1131 SEQ ID NO:
1117 1136 SEQ ID NO:
1132 1151 SEQ ID NO:
1162 1181 SEQ ID NO:
1174 1193 SEQ ID NO:
1188 1207 SEQ ID NO:
1204 1223 SEQ ID NO:
1210 1229 SEQ ID NO:
1211 1230 SEQ ID NO:
1218 1237 SEQ ID NO:
1219 1238 SEQ ID NO:
1248 1267 SEQ ID NO:
1332 1351 SEQ ID NO:
1349 1368 SEQ ID NO:
1440 1459 SEQ ID NO:
1480 1499 SEQ ID NO:
1516 1535 SEQ ID NO:
1546 1565 SEQ ID NO:
1548 1567 SEQ ID NO:
1560 1579 SEQ ID NO:
1610 1629 SEQ ID NO:
1618 1637 SEQ ID NO:
1629 1648 SEQ ID NO:
1703 1722 SEQ ID NO:
1738 1757 SEQ ID NO:
1744 1763 SEQ ID NO:
1759 1778 SEQ ID NO:
1781 1800 SEQ ID NO:
1796 1815 SEQ ID NO:
1890 1909 SEQ ID NO:
1910 1929 SEQ ID NO:
1913 1932 SEQ ID NO:
1924 1943 SEQ ID NO:
1929 1948 SEQ ID NO:
1930 1949 SEQ ID NO:
1935 1954 SEQ ID NO:
1938 1957 SEQ ID NO:



^ r tn 3 n t fi n ntttptcaaaa 


12453 


12472 1 


4 


4.1*^4 nntna ntnnntttfltcaaQ 


12452 


12471 1 


4 


4.'^'? ^ ttnnsi a tn a n ptr!?i ta o ct 


3813 


3832 1 


4 




3812 


3831 1 


4 


"-rOk?/ y Ldyy cJcJiaicitaiyy cay eaay 


9461 


9480 1 


4 


^0»jO iLdLiybjiy cjca ivji^cjooay 


13656 


13675 1 


4 




1669 


1688 1 


4 


''vJKjsj uciCiC!^ vn_«nLv^ci o ly o l 


1663 

1 W 


1 687 1 


4 


^.'^R'l traraaator'tttncifiiat 


9675 


9694 1 


4 




2077 


2096 1 


4 

1 


A '^R'^rtnnls^^rtartttaflifinrl 


5495 


5514 1 


4 




13192 


13211 1 


4 


Zl'^RR a+nnnatttttiTir'fliflntt 
H'OO'J q ty y >-»C4HH.Ltyi-«e7CTmt 


14014 


14033 1 


4 


TfOOO ay rally raiyyyuay ii^rfOy 


4572 


4591 1 


4 


4'^R7 pf f*fl a anaatosptttttt 


2578 


2597 1 


4 




12209 


12228 1 


4 


A^f\Q rtranfltr^^asnttastttG 


12273 


12292 1 


4 


A'^YO nannntan'tpaitfiaranta 


10337 


10356 1


4 




12580 


12599 1


4 


to t ^ iyy*rft«»B*w«ieiy'^»iyyciv*t 


8866 


8885 1 


4 


4,^ 7*^ r a actaacctcatcaaciat 


2169 


2188 1 


4 


4^)74 tcaact CI acctcatcaaaa 


2168 


2187 1 


4 


4'^7'i anrttanarnaaaactaaca 


13903 


13982 1 


4 




11248 


11267 1 


4 


4'^77apf'anatfiPtciaacaQtaa 


8148 


8167 1 


4 


4*^78 arr acltacaactaaaciaa 

"Tw I U awwOwUMweiy wfcfc«y s*y y y 


10824 


10843 1 


4 


4^79 aaaca acctaaattataac 

"TW « »y y y y i-»y duiwLtit«y ny *y 


3439 


3458 1 


4 


43Rn ta a eta ataacctaaaaaa 


5586 


5605 1 


4 


aatcctttataattatatc 


12355 


12374 1 


4 


4'^R? ttcGcaaaaacaatcaaca 


9938 


9957 1 


4 


4383 ttcaaatccatacaaatca 


10917 


10936 1


4 


4^A4trttnaaracaaaatcaQt 

toot LwLLy CiClwClwCicmy luny l 


6007 


6026 1 


4 


4^R'^aaatnaaacitaaaaatcat 


8118 


8137 1 


4 


4'^RRrmntaaannaaaacttaat 


9024 


9043 1 


4 


4*^87 aaattactaaaaaaactac 


13727 


13746 1 


4 


4*^88 atfnfiatatppaaaatcta 


1933 


1952 1 


4 


A'^RQr^n atn 9 f*f*tPP an pfppt fl 


2485 


2504 1 


4 


tOOU Liiy ddaLawida ly li'i^iy y 


5519 


5538 1 


4 


4'^Q'I naaaaapttnaaaacaacc 

toe 1 y aaaciciuiLy yctaci\«c4ovw 


4439 


4458 1 


4 


*Hoy 4 ag a aiCCa g aiaCaay add 


6893


6912 1 

^^^^ 1 I 


4 


4393cagcatgcctagtttctcc 


9952 


9971 1 


4 


4394tcaatatGaaaagcccagc 


12045 


12064 1 


4 


4395 atatctggaaccttgaagt 


10737 


10756 1 


4 


4396 ctgaactcagaaggatggc 


14000 


14019 1 


4 


4397 cttccattctgaatatatt 


13378 


13397 1


4 


4398 g ataaaagattaclttgag 


7273 


7292 1 


4 


4399 tcttcaatttattcttctt 


13825 


13844 1 


4 


4400 atcttcaatttattcttct 


13824 


13843 1


4 


4401 ttcacataccagaattcca 


8325 


8344 1


1 4 


4402tttttaaccagtcagatat 


10185 


10204 ' 


1 
SEQ ID NO:


3063 


tatccaagatctgaaaaag 


1939 


1958SEQIDNO: 


4403 ctttttaaccagtcagata 


10184 


10203 1


4 


SEQ ID NO: 


3064 


caagatctgaaaaagttag


1943 


1962SEQIDNO: 


4404 ctaaattcccatggtcttg 


4973 


4992 1 


4 


SEQ ID NO: 


3065


aagatctgaaaaagttagt 


1944 


1963 SEQ ID NO: 


4405 actaaattcccatggtctt


4972 


4991 1 


4 


SEQ ID NO: 


3066


tgaaaaagttagtgaaaga 


1950 


1969 SEQ ID NO:


4406 tctttctcgggaatattca


10630 


10649 1 


4 


SEQ ID NO: 


3067 


tccaactgtcatggacttc 


1990 


2009 SEQ ID NO: 


4407 gaagcacatatgaactgga 


13945 


13964 1 


4 


SEQ ID NO: 


3068 


tcagaaaattctctcggaa 


2007 


2026 SEQ ID NO: 


4408 ttcctttaacaattcctga 


9501 


9520 1 


4 


SEQ ID NO: 


3069 


ttccatcacttgacccagc 


2052 


2071 SEQ ID NO: 


4409 gctgacatagggaatggaa 


8441 


8460 1 


4 


SEQ ID NO: 


3070 


cccagcctcagccaaaata


2065 


2084 SEQ ID NO: 


4410 tattctatccaagattggg


7820 


7839 1 


4 


SEQ ID NO: 


3071 


a g cctcs g cca a a atagsa 


2068 


2087 SEQ ID NO: 


4411 ttctatccaagattgggct


7822 


7841 1 


4 


SEQ ID NO: 


3072 


atcttatatttgatccaaa


2091 


2110SEQ ID NO: 


4412tttgaaaaacaaagcagat 


11821 


11840 1


4 


SEQ ID NO: 


3073 


tcttatatttgatccaaat 


2092 


2111 SEQ ID NO:


4413 attttttgcaagttaaaga


14019 


14038 1 


4 


SEQ ID NO: 


3074 


cttcctaaagaaagcatgc 


2117 


2136SEQIDNO: 


4414 gcatggcattatgatgaag


3614 


3633 1


4 


SEQ ID NO: 


3075 


ctaaagaaagcatgctgaa 


2121 


2140SEQID NO: 


4415 ttcagggtgtggagtttag


5694 


5713 1


4 


SEQ ID NO: 


3076 


taaagaaagcatgctgaaa 


2122 


2141 SEQ ID NO: 


4416 tttcttaaacattccttta


9490 


9509 1 


4 


SEQ ID NO: 


3077 


gagattggcttggaaggaa 


2183 


2202 SEQ ID NO:


4417 ttccctccattaagttctc


11709 


11728 1 


4 


SEQ ID NO: 


3078 


ctttgagccaacattggaa 


2206


2225SEQ ID NO: 


4418 ttccaatgaccaagaaaag


11068 


11087 1 


4 


SEQ ID NO: 


3079 


cagacagtgtcaacaaagc


2253 


2272SEQ ID NO: 


4419 gcttactggacgaactctg


6142 


6161 1 


4 


SEQ ID NO: 


3080 


cagtgtcaacaaagctttg


2257 


2276SEQ ID NO: 


4420 caaattcctggatacactg 


9857 


9876 1 


4 


SEQ ID NO: 


3081 


agtgtcaacaaagctttgt


2258 


2277SEQ ID NO: 


4421 acaagaatacgtctacact 


4359 


4378 1 


A 

4 


SEQ ID NO: 


3082 


ctgatggtgtctctaaggt


2298 


2317SEQ ID NO: 


4422 acctcggaacaatcctcag


3333 


3352 1 


A 


SEQ ID NO: 


3083 


tgatggtgtctctaaggtc 


2299 


2318SEQ ID NO: 


4423gacctgcgcaacgagatca 


8831 


8850 1 


A 

4 


SEQ ID NO: 


3084 


aaacatgagcaggatatgg


2351 


2370SEQ ID NO: 


4424 ccatgatctacatttgttt


6796 


6815 1 


A 

4 


SEQ ID NO: 


3085 


gaagctgattaaagatttg


2395 


2414SEQ ID NO: 


4425caaaaacattttcaacttc 


5287 


5306 1 


A 

4 


SEQ ID NO: 


3086 


aaagatttgaaatccaaag 


2405 


2424SEQ ID NO: 


4426ctttaagttcagcatcttt 


7614 


7633 1 


A 

4 


SEQ ID NO: 


3087 


gatgggtgcccgcactctg 


2518 


2537 SEQ ID NO:


4427 cagatttgaggattccatc


7983 



8002 1 


4 


SEQ ID NO: 


3088 


gggatcccccagatgattg 


2540 


2559SEQ ID NO: 


4428caatcacaagtcgattccc 


9083 


9102 1 


A 

4 


SEQ ID NO: 


3089 


ttttcttcactacatcttc 


2593 


2612SEQ ID NO: 


4429 gaagtgtcagtggcaaaaa


10382 


10401 1 


A 

4 


SEQ ID NO: 


3090 


tcttcactacatcttcatg


2596 


2615SEQ ID NO: 


4430catggcattatgatgaaga 


3615


3634 1 


4 


SEQ ID NO: 


3091 


tacatcttcatggagaatg 


2603 


2622SEQ ID NO: 


4431 cattatggaggcccatgta


9445 


9464 1 


4 


SEQ ID NO: 


3092 


ttcatggagaatgcctttg 


2609 


2628SEQ ID NO: 


4432caaaatcaactttaatgaa 


6607 


6626 1 


4 


SEQ ID NO: 


3093 


tcatggagaatgcctttga 


2610 


2629SEQ ID NO: 


4433 tcaacacaatcttcaatga


13116 


13135 1 


A 

4 


SEQ ID NO: 


3094 


tttgaactccccactggag 


2624 


2643SEQ ID NO: 


4434ctccccaggacctttcaaa 


9842 



9861 1 


A 

4 


SEQ ID NO: 


3095 


ttgaactccccactggagc 


2625 


2644 SEQ ID NO: 


4435 gctccccaggacctttcaa


9841 


9860 1 


A 

4 


SEQ ID NO: 


3096 


tgaactccccactggagct 


2626 


2645 SEQ ID NO: 


4436 agctccccaggacctttca 


9840 


9859 1 


4 


SEQ ID NO: 


3097 


cactggagctggattacag 


2635 


2654 SEQ ID NO:


4437ctgtttctgagtcccagtg 


9344 



9363 1 


4 


SEQ ID NO: 


3098 


actggagctggattacagt 


2636 


2655 SEQ ID NO: 


4438 actgtttctgagtcccagt


9343 


9362 1 


4 


SEQ ID NO: 


3099 


agttgcaaatatcttcatc 


2652 


2671 SEQ ID NO: 


4439 gatgatgccaaaatcaact 


6599


6618 1 


A 

4 


SEQ ID NO: 


3100 


gttgcaaatatcttcatct 


2653 


2672 SEQ ID NO: 


4440 agatgatgccaaaatcaac 


6598 


6617 1 


4 


SEQ ID NO:


3101 


aaatatcttcatctggagt


2658 


2677SEQ ID NO: 


4441 actcagaaggatggcattt 


14004 


14023 1 


4 


SEQ ID NO: 


3102 


taaaactggaagtagccaa 


2703 


2722 SEQ ID NO: 


4442ttggttacaggaggcttta 


7600 


7619 1 


4 


SEQ ID NO: 


3103 


ggctgaactggtggcaaaa 


2728 


2747 SEQ ID NO: 


4443ttttcttttcagcccagcc 


9228 


9247 1 


4 


SEQ ID NO: 


3104 


tgtggagtttgtgacaaat 


2758 


2777 SEQ ID NO: 


4444 attttcaagcaaatgcaca 


8538 


8557 1 


4 


SEQ ID NO: 


3105 


ttgtgacaaatatgggcat


2766 


2785 SEQ ID NO: 


4445 atgcgtctaccttacacaa 


9521 


9540 1 


4 


SEQ ID NO: 


3106 


atgaacaccaacttcttcc 


2819 


2838 SEQ ID NO:


4446 ggaagctgaagtttatcat


2877 


2896 1 


4 


SEQ ID NO: 


3107 


cttccacgagtcgggtctg 


2833 


2852 SEQ ID NO: 


4447cagagctatcactgggaag 


5235 


5254 1 


4 


SEQ ID NO: 


3108 


gagtcgggtctggaggctc 


2840 


2859SEQ ID NO: 


4448 gagcttactggacgaactc 


6140 


6159 1 


4 


SEQ ID NO: 


3109 


cctaaaagctgggaagctg 


2866 


2885 SEQ ID NO: 


4449cagcctccccagccgtagg 


12120 


12139 1 


4 


SEQ ID NO: 


3110 


agctgggaagctgaagttt 


2872 


2891 SEQ ID NO: 


4450 aaactgttaatttacagct


5463 


5482 1 


4 


SEQ ID NO: 


3111 


ccagsttagagctggaact 


3114 


3133SEQ ID NO: 


4451 agtttccggggaaacctgg 


12726 


12745 1 


4 


SEQ ID NO: 


3112 


ggataccctgaagtttgta 


3208 


3227 SEQ ID NO: 


4452 tacagtattctgaaaatcc 


8393 


8412 1 


4 
SEQ ID NO:
SEQ ID NO:
SEQ ID NO:
SEQ ID NO:
SEQ ID NO:
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3113 ctgaggctaccatgacatt 

3114 tgtccagtgaagtccaaat 

3115 aattccggattttgatgtt 

3116 ttccggattttgatgttga 

3117 cggaacaatcctcagagtt

3118 tcctcagagttaatgatga 
3119 ctcaccctggacattcaga

3120 cattcagaacaagaaaatt 

3121 actgaggtcgccctcatgg 

3122 ttatttccataccccgttt

3123 gtttgcaagcagaagccag

3124 tttgcaagcagaagccaga

3125 ttgcaagcagaagccagaa 

3126 ctgcttctccaaatggact 

3127 tgctacagcttatggctcc 

3128 acagcttatggctccacag 

3129 tttccaagagggtggcatg 
3130 ccaagagggtggcatggca

3131 gtggcatggcattatgatg 

3132 tgatgaagagaagattgaa 

3133 gaagagaagattgaatttg 

3134 gagaagattgaatttgaat

3135 tttgaatggaacacaggca 

3136 aggcaccaatgtagatacc 

3137 caaaaaaatgacttccaat 

3138 aaaaaaatgacttccaatt 

3139 aaaaaatgacttccaattt 

3140 cagagtccctcaaacagac 

3141 aaattaatagttgcaatga 

3142 ttcaacctccagaacatgg

3143 tgggattgccagacttcca 

3144 cagtttgaaaattgagatt 

3145 gaaaattgagattcctttg

3146 tttgccttttggtggcaaa

3147 ctccagagatctaaagatg
3148 tctaaagatgttagagact

3149 ctgtgggattccatctgcc

3150 atctgccatctcgagagtt 

3151 tctcgagagttccaagtcc 

3152 agtccctacttttaccatt

3153 acttttaccattcccaagt

3154 cattcccaagttgtatcaa
3155 accacatgaaggctgactc
3156 tttcctacaatgtgcaagg
3157 ctggagaaacaacatatga

3158 atcatgtgatgggtctcta

3159 catgtgatgggtctctacg

3160 ttctagattcgaatatcaa 

3161 tggggaccacagatgtctg 

3162 ctaacactggccggctcaa



3252 3271 SEQ ID NO:
3297 3316 SEQ ID NO:
3313 3332 SEQ ID NO:
3315 3334 SEQ ID NO:
3337 3356 SEQ ID NO:
3345 3364 SEQ ID NO:
3392 3411 SEQ ID NO:
3403 3422 SEQ ID NO:
3422 3441 SEQ ID NO:
3486 3505 SEQ ID NO:
3501 3520 SEQ ID NO:
3502 3521 SEQ ID NO:
3503 3522 SEQ ID NO:
3554 3573 SEQ ID NO:
3577 3596 SEQ ID NO:
3581 3600 SEQ ID NO:
3600 3619 SEQ ID NO:
3603 3622 SEQ ID NO:
3611 3630 SEQ ID NO:
3625 3644 SEQ ID NO:
3629 3648 SEQ ID NO:
3632 3651 SEQ ID NO:
3644 3663 SEQ ID NO:
3658 3677 SEQ ID NO:
3676 3695 SEQ ID NO:
3677 3696 SEQ ID NO:
3678 3697 SEQ ID NO:
3760 3779 SEQ ID NO:
3803 3822 SEQ ID NO:
3899 3918 SEQ ID NO:
3915 3934 SEQ ID NO:
3994 4013 SEQ ID NO:
4000 4019 SEQ ID NO:
4015 4034 SEQ ID NO:
4036 4055 SEQ ID NO:
4045 4064 SEQ ID NO:
4092 4111 SEQ ID NO:
4104 4123 SEQ ID NO:
4112 4131 SEQ ID NO:
4126 4145 SEQ ID NO:
4133 4152 SEQ ID NO:
4141 4160 SEQ ID NO:
4284 4303 SEQ ID NO:
4317 4336 SEQ ID NO:
4338 4357 SEQ ID NO:
4378 4397 SEQ ID NO:
4380 4399 SEQ ID NO:
4407 4426 SEQ ID NO:
4499 4518 SEQ ID NO:
4644 4663 SEQ ID NO:



4453 aatgagctcatggcttcag
4454 attttgagaggaatcgaca
4455 aacacatgaatcacaaatt
4456 tcaaaacgagcttcaggaa
4457 aacttgtacaactggtccg
4458 tcatcaattggttacagga
4459 tctgcagaacaatgctgag
4460 aattgactttgtagaaatg
4461 ccatgcaagtcagcccagt
4462 aaactgcctatattgataa
4463 ctggacttctcttcaaaac
4464 tctgggtgtcgacagcaaa
4465 ttctgggtgtcgacagcaa
4466 agtcaagattgatgggcag
4467 ggaggctttaagttcagca
4468 ctgtatagcaaattcctgt
4469 catggacttcttctggaaa
4470 tgcccaQcaagcaagttgg
4471 catccttaacaccttccac
4472 ttcactgttcctgaaatca
4473 caaaaacattttcaacttc
4474 attcataatcccaactctc
4475 tgcctttgtgtacaccaaa
4476 ggtaacctaaaaggagcct
4477 attgaagtacctacttttg
4478 aattgaagtacctactttt
4479 aaatccaatctcctctttt
4480 gtctgtgggattccatctg
4481 tcataagttcaatgaattt
4482 ccattgaccagatgctgaa
4483 tggaaatgggcctgcccca
4484 aatcacaactcctccactg
4485 caaaactaccacacatttc
4486 tttgagaggaatcgacaaa
4487 catcaattggttacaggag
4488 agtccttcatgtccctaga
4489 ggcattttgaaaaaaacag
4490 aactctcaaaccctaagat
4491 ggacattcctctagcgaga
4492 aatgaatacagccaggact
4493 actttgtagaaatgaaagt
4494 ttgaaggacttcaggaatg
4495 gagtaaaccaaaacttggt
4496 cctttaacaattcctgaaa
4497 tcattctgggtctttccag
4498 tagaattacagaaaatgat
4499 cgtaggcaccgtgggcatg
4500 ttgatgatgctgtcaagaa
4501 cagaattccagcttcccca
4502 ttgaggctattgatgttag



3817 3836 1 4
6357 6376 1 4
8938 8957 1 4
13207 13226 1 4
4211 4230 1 4
7593 7612 1 4
12439 12458 1 4
8104 8123 1 4
10924 10943 1 4
13880 13899 1 4
5408 5427 1 4
5272 5291 1 4
5271 5290 1 4
4567 4586 1 4
7609 7628 1 4
5897 5916 1 4
8877 8896 1 4
9361 9380 1 4
8071 8090 1 4
7871 7890 1 4
5287 5306 1 4
8278 8297 1 4
11236 11255 1 4
5591 5610 1 4
8366 8385 1 4
8365 8384 1 4
8406 8425 1 4
4090 4109 1 4
13186 13205 1 4
8142 8161 1 4
8903 8922 1 4
9541 9560 1 4
13694 13713 1 4
6359 6378 1 4
7594 7613 1 4
10033 10052 1 4
9735 9754 1 4
8556 8575 1 4
8215 8234 1 4
6086 6105 1 4
8109 8128 1 4
12009 12028 1 4
9024 9043 1 4
9503 9522 1 4
11035 11054 1 4
6565 6584 1 4
12133 12152 1 4
7308 7327 1 4
8334 8353 1 4
6984 7003 1 4
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3163 
3164 

3165 

3166 

3167 

3168 

3169 

3170 

3171 

3172 

3173 

3174 

3175 

3176 

3177 

3178 

3179 

3180 

3181 

3182 

3183 

3184 

3185 

3186 

3187 

3188 

3189 

3190 

3191 

3192 

3193 

3194 

3195 

3196 

3197 

3198 

3199 

3200 

3201 

3202 

3203 

3204 

3205 

3206 

3207 

3208 

3209 

3210 

3211 

3212 



taacactggccggctcaat
aacactggccggctcaatg
ctggccggctcaatggaga
agataacaggaagatatga
tccctcacctccacctctg
agctgactttaaaatctga
ctgactttaaaatctgaca
caagatggatatgaccttc
gctgcgttctgaatatcag
cgttctgaatatcaggctg
aattcccatggtcttgagt
tggtcttgagttaaatgct
cttgagttaaatgctgaca
ttgagttaaatgctgacat
tgagttaaatgctgacato
acttgaagtgtagtctcct
agtgtagtctcctggtgct
gtgctggagaatgagctga
ctggggcatctatgaaatt
atggccgcttcagggaaca
ttcagtctggatgggaaag
ccatgattctgggtgtcga
aaaacattttcaacttcaa
cttaagctctcaaatgaca
ttaagctctcaaatgacat
catgatgggctcatatgct
tgggctcatatgctgaaat
actggacttctcttcaaaa
acttctcttcaaaacttga
ctgacaagttttataagca
aagttttataagcaaactg
ctgttaatttacagctaca
ttacagctacagccctatt
tctggtaactactttaaac
tttaaacagtgacctgaaa
ttaaacagtgacctgaaat
cagtgacctgaaatacaat
tgtggctggtaacctaaaa
ttatcagcaagctataaag
ggttcagggtgtggagttt
attcagactcactgcattt
ttcagactcactgcatttc
tacaaatggcaatgggaaa
gctgtatagcaaattcctg
tgagcagacaggcacctgg
ggcacctggaaactcaaga
tgaatacagccaggacttg
gaatacagccaggacttgg
ctggacgaactctggctga
ttttactcagtQagcccat



4645 4664 SEQ ID NO:
4646 4665 SEQ ID NO:
4650 4669 SEQ ID NO:
4713 4732 SEQ ID NO:
4745 4764 SEQ ID NO:
4818 4837 SEQ ID NO:
4820 4839 SEQ ID NO:
4873 4892 SEQ ID NO:
4909 4928 SEQ ID NO:
4913 4932 SEQ ID NO:
4976 4995 SEQ ID NO:
4984 5003 SEQ ID NO:
4988 5007 SEQ ID NO:
4989 5008 SEQ ID NO:
4990 5009 SEQ ID NO:
5094 5113 SEQ ID NO:
5100 5119 SEQ ID NO:
5114 5133 SEQ ID NO:
5151 5170 SEQ ID NO:
6178 6197 SEQ ID NO:
5207 5226 SEQ ID NO:
5265 5284 SEQ ID NO:
5289 5308 SEQ ID NO:
5324 5343 SEQ ID NO:
5325 5344 SEQ ID NO:
5341 5350 SEQ ID NO:
5346 5365 SEQ ID NO:
5407 5426 SEQ ID NO:
5412 5431 SEQ ID NO:
5445 5464 SEQ ID NO:
5450 5469 SEQ ID NO:
5466 5485 SEQ ID NO:
5474 5493 SEQ ID NO:
5494 5513 SEQ ID NO:
5506 5525 SEQ ID NO:
5507 5526 SEQ ID NO:
5512 5531 SEQ ID NO:
5584 5603 SEQ ID NO:
6667 5676 SEQ ID NO:
5692 5711 SEQ ID NO:
5775 5794 SEQ ID NO:
5776 5795 SEQ ID NO:
5848 5867 SEQ ID NO:
5896 5916 SEQ ID NO:
6043 6062 SEQ ID NO:
6053 6072 SEQ ID NO:
6088 6107 SEQ ID NO:
6089 6108 SEQ ID NO:
6147 6166 SEQ ID NO:
6201 6220 SEQ ID NO:



4503 attgaggctattgatgtta
4504 cattgaggctattgatgtt
4505 tctccatctgcgctaccag
4506 tcatctcctttcttcatct
4507 cagatatatatctcaggga
4508 tcaggctcttcagaaagct
4509 tgtcaagataaacaatcag
4510 gsagtagtactgcatcttg
4511 ctgagtcccagtgcccagc
4512 cagcaagtacctgagaacg
4513 actcagatcaaagttaatt
4514 agcacagtacgaaaaacca
4515 tgtccctagaaatctcaag
4516 atgtccctagaaatctcaa
4517 gatggaaccctctccctca
4518 aggaaactcagatcaaagt
4519 agcagccagtggcaccact
4520 tcagccaggtttatagcac
4521 aatttctgattaccaccag
4522 tgttttttggaaatgccat
4523 ctttgacaggcattttgaa
4524 tcgatgcacatacaaatgg
4525 ttgatgttagagtgctttt
4526 tgtcctacaacaagttaag
4527 atgtcctacaacaagttaa
4528 agcatctttggctcacatg
4529 atttatcaaaagaagccca
4530 ttttggcaagctatacagt
4531 tcaattgggagagacaagt
4532 tgctttgtgagtttatcag
4533 cagtcatgtagaaaaactt
4534 tgtactggaaaacgtacag
4535 aatattgatcaatttgtaa
4536 gtttgaaaaacaaagcaga
4537 tttcatttgaaagaataaa
4538 atttcaagcaagaacttaa
4539 attggcgtggagcttactg
4540 ttttgctggagaagccaca
4541 ctttgcactatgttcataa
4542 aaacacctaagagtaaacc
4543 aaatgctgacatagggaat
4544 gaaatattatgaacttgaa
4545 tttcctaaagctggatgta
4546 caggtccatgcaagtcagc
4547 ccagcttccccacatctca
4548 tcttcgtgtttcaactgcc
4549 caagtaagtgctaggttca
4550 ccaacacttacttgaattc
4551 tcagaaagdaccttccag
4552 atggacttcttctggaaaa



6983 
6982 
12073 
10210 
8184 
7930 
8740 
6843 
9350 
8511 
12272 
10809 
10042 
10041 
4733 
12267 
12514 
7734 
13579 
8649 
9727 
5838 
6993 
7255 
7254 
7624 
12942 
8380 
6504 
9693 
4429 
6338 
6425 
11820 
7032 
10434 
5131 
10765 
12764 
9014 
8437 
13312 
11176 
10919 
8341 
11221 
9380 
10668 
7939 
8878 



7002 
7001 
12092 
10229 
8203 
7949 
8759 
6862 
9369 
8630 
12291 
10828 
10061 
10060 
4752 
12286 
12533 
7753 
13598 
8668 
9746 
5857 
7012 
7274 
7273 
7643 
12961 
8399 
6523 
9712 
4448 
6407 
6444 
11839 
7051 
10453 
6150 
10784 
12783 
9033 
8456 
13331 
11195 
10938 
8360 
11240 
9399 
10687 
7958 
8897 
SEQ ID NO: 3213 






6260 SEQ ID NO 


1: 4553ctcatctcctttcttcatc 


10209 


10228 


1 4 


obU ID NO 


: 3214 


a a tf tt rt f^t+f t n + a o ;a n 

ddiivjiiyoLuigiciaay 


b2/7 


6296 SEQ ID NO 


1 : 4554 cttttclaaacttgaaatt 


9064 


9083 


1 4 


ccr/^ in 


3215 


^llliy laciel^lcll^elLcIa 


oZoo 


6304 SEQ ID NO 


: 4555ttatgaacttgaagaaaag 


13318 


13337 


1 4 


ocU lU NC 


: 3216 


tttnf SI 9 9 rt totnata oata 


bZo/ 


6306SEQ ID NO 


: 4556ttttcacattagatgcaaa 


8421 


8440 


1 4 


otw ID NC 


: 3217 


tCUdlla 3 CUlCUua tin 


D02U 


6339 SEQ ID NO 


: 4557aaaa1tgatgatatctgga 


10727 


10746 


1 4 


obU ID NU 


: 321 3 


p 3 tt a a r*t/^ r*r^ Q + H++ 


OO^ 1 


6340 SEQ ID NO 


: 4558aaaagggtcatggaaatgg 


8893 


8912 


1 4 


otlj ID IMU 


32 1 9 


f^l'f nr^aa n a sifatf +f n o n 

•L' I Ly VjCics £3 a lea 1 1 1 Ly a Lj 


Do40 


6365 SEQ ID NO 


; 4559ctcaattttgattttcaag 


8528 


8547 


1 4 


oEQ ID NO 


I 3220 


(dgaciiauLigBydygddl 


o3o2 


6371 SEQ ID NO 


: 4560attccctccattaagttct 


11708 


11727 


1 4 


ocU ID NU 


3221 


d 1 Ld (dy L ly LclL>Ly y d dd 


oooU 


6399 SEQ ID NO 


4561 tttcaagcaagaacttaat 


10435 


10454 


1 4 






^ d d y ^.id Lwct a la Ll^ d I 


O^iO 


6434SEQ ID NO 


: 4562atcagttcagataaacttc 


7999 


8018 


1 4 


ceo m MO 
otU ID NU 


3223 


d^dlOdd IdllgdlCddll 


Q4ZU 


6439SEQ ID NO 


4563aattccctgaagttgatgt 


11487 


11506 


1 4 


ocU ID NU 




n S S 9 9 i^f/^/^/^a n Q o <~t 
^ o a ci d l.« l^^^cll^d y U a d y U 


u-^oO 


^484sEQ ID NO 


: 4564gctttcxttccacatttc 


10060 


10079 


1 4 


OCU ID l\U 




^'■H^'^Li'^'di.Lucia iLy y y 




65T3SEQ ID NO 


. 4565cccatttacagatcttcag 


11371 


11390 


1 4 


Qco in K\r\ 




I 6id I d 1 1,^ d d 1 L y u u d 


c>*+yo 


0514SEQ ID NO 


: 4566tcccalttacagatcttca 


11370 


11389 


1 4 


QPO ir> MO 


o2Z7 


ddtiiydUl^CrlULOdLtddd 


004U 


6559SEQ ID NO 


: 4567tttgaggattccatcagtt 


7987 


8006 


1 4 


QCO in MO 
ocU ID NU 


3228 


aoddyidldyddtldl^ayd 


OOOo 


o577sEQ ID NO 


I 4568tctggGtccctcaactttt 


9050 


9069 


1 4 


cco m MO 
otiU ID IMU 


: 3229 


d&(..rddwincldiy dddadU 


DDI 1 


6630SEQ ID NO 


: 4569gtttattgaaaatattgat 


6811 


6830 


1 4 


QPO in MO 
oCW IU INU 


o2oU 


t n sitf tna a a ata nnf est + 
L^diLi^ddodidy Lfidl I 




6713SEQ ID NO 


: 4570aatattattgalgaaatca 


6716 


6735 


1 4 


QPPk in MO 

OCU ID INU 




aiiiyaddaldy L>Ld uyo 


uoyD 


S^'^^sEQ ID NO 


; 4571gcaagaacttaatggaaat 


10441 


10460 


1 4 


^P/^ In MO 
C>IZVa2 ID NU 


o2o2 


ct L Ly ^,»ieici Id I Id L l^d ly 


Or IU 


6729 SEQ ID NO 


4572catcacactgaataccaat 


10159 


10178 


1 4 


in MO 

OCVjt IU NU 




jjciaaaaLidciddciyiLFlLy 


Of Of 


6756SEQ ID NO 


: 4573caagsgcttatgggatttc 


11161 


11180 


1 4 


QPO In MO 
otii^ ID NU 


o2o4 


LL»d Ld iiiA^y ly id dL 




O'olsEQ ID NO 


4574attactttgagaaattagt 


7281 


7300 


1 4 


in MO 

oCU IU INU 


o2oiD 


laiiy diLLiddOddddy 1, 


DOilO 


6842SEQ ID NO 


4575acltgacttcagagaaata 


114D4 


11423 ' 


1 4 


cno m MO 
otU ID NU 


3236 


r^Tnoaj^j^a /•i^'Hr'i ^ /t i~t *^ 

L.Ly uayucjyc^uday agac 


CO"! /I 


6933SEQ ID NO 


4576 gtctlcaglgaagctgcag 


10699 


10718 ' 


1 4 


QPr^ in MO 
OtzU ID NU 


>32o / 


fia a a r*a a Pa r'at+na /in/-»4 
CI d a d Lrd d Od Lrd Liy dy y CI 


oy /o 


S992SEQ ID NO 


4577agcctcacctcttactttt 


10571 


10590 - 


I 4 


Qpo in MO- 

OCVdj IU NU, 


oZoo 


iiy ay ly L(.«ddd(.rdOll 




7078SEQ ID NO, 


4578 aagtagclgagaaaatcaa 


7104 


7123 1 


4 


opo in MO- 


oZj9 


iii^aay ldy^.»iy ay elclad 


f lUU 


'119SEQ ID NO: 


4579ttttcacattagatgcaaa 


8421 


8440 ' 


1 4 


Qco in Mo< 

OCW IU NU. 


o24u 


iidy idy dy uy yowjclUC 


7*100 

/ 1 yy 


■7OH Q ^ 

7218SEQ ID NO: 


4580ggtggactcttgctgctaa 


7776 


7796 ' 


4 


Qpo 1 Pi mo- 
OCU IU NU. 




tnaannanar^'attr'anaa 
lydayydydULdlluaydd 




7246SEQ ID NO: 


4581 ttctcaattttgattttca 


8526 


8545 1 


4 


oco in MO- 
ocu IU NU. 


o24Z 


vjciy ctoia LLLidyddyoLdd 


70*30 


7251 SEQ ID NO: 


45 82 ttagccacagctctgtctc 


10301 


10320 1 


4 


QPO m MO- 




fiatianttnnattlattna 


/ Zoo 


^3^2gEQ ID NO: 


4583 tcaag aagcttaatgaatt 


7320 


7339 1 


4 


QCO m MO- 
OCW IU NU. 


o244 


y wLidaiyddiidiCrLLii 


70 ov 


7346SEQ ID NO: 


4584aaaacgagcttcaggaagc 


13209 


13228 1 


4 


QFO in MO* 


02.40 


ttaacaa attnf*itnanat 

imavrfaaailv.A.riiy d^dl 


70ftc 
/ OD3 


7^84sEQ ID NO: 


4585 atgtcctacaacaagttaa 


7254 


7273 1 


4 


Qpo in MO' 


Oi^4o 


a aatt afl a ntr^ aftf n aff 




^41 3 SEQ ID NO: 


4586aatcctttgacaggcattt 


9723 


9742 1 


4 


opo in MO' 

OIZW IU IMU. 


dZ47 


n 3 f4^a a t n n t n a a a t tf~" a 
^dv^ioddiyy lydddllud 


/ 4o4 


7/1 Q'3 ^ . 

7483SEQ ID NO; 


4587 tg aa attcaatcacaagtc 


9076 


9095 1 


4 


cpo in MO- 

IU INU. 


00>ID 

o<£4o 


n aaattpann pfotnn a ap 

y aaai iVi^ay y LrlLriyy ddC 




7494SEQ ID NO: 


4588 gttctcaattttgattttc 


8525 


8544 1 


4 


In MO' 


oz4y 


a Vi^ L cl^^^i^ a o a a a □ a y o Ly d d 




^^^^^SEQ ID NO: 


4589ttcaggaactattgctagt 


10645 


10664 1 


4 


Qpo in MO- 

OCIW IU NU. 




cca a a ataafv^tta afpa t 


/Or 0 


7597SEQ ID 


4590 atgatttccclgaccttgg 


10950 


10969 1 


4 


eco in NO* 




aaataapnffaafoatr^an 


/ OO 1 


'^oOOsEQ ID NO: 


A t?^\ ^ J-JL _ L » i « 

4591 ttgaagtaaaagaaaattt 


10749 


10768 1 


4 


SEO ID NO* 




tttaacittcaacatcttta 




/OJ4SEQ ID NO; 


4592 caaatctggatttcttaaa 


9480 


9499 1 


4 


SEQ ID NO: 3253 caggtttatagcacacttg 7739 7758 SEQ ID NO: 4593 caagggttcactgttcctg 7865 7884 1 4
SEQ ID NO: 3254 gttcactgttcctgaaatc 7870 7889 SEQ ID NO: 4594 gattctcagatgagggaac 8922 8941 1 4
SEQ ID NO: 3255 cactgttcctgaaatcaag 7873 7892 SEQ ID NO: 4595 cttgaacacaaagtcagtg 6008 6027 1 4
SEQ ID NO: 3256 actgttcctgaaatcaaga 7874 7893 SEQ ID NO: 4596 tcttgaacacaaagtcagt 6007 6026 1 4
SEQ ID NO: 3257 gcctgcctttgaagtcagt 7909 7928 SEQ ID NO: 4597 actgttgactcaggaaggc 12580 12599 1 4
SEQ ID NO: 3258 taacagatttgaggattcc 7980 7999 SEQ ID NO: 4598 ggaagcttctcaagagtta 13222 13241 1 4
SEQ ID NO: 3259 gttttccacaccagaattt 8050 8069 SEQ ID NO: 4599 aaatttctctgctggaaac 9418 9437 1 4
SEQ ID NO: 3260 tcagaaccattgaccagat 8136 8155 SEQ ID NO: 4600 atctgcagaacaatgctga 12438 12457 1 4
SEQ ID NO: 3261 tagcgagaatcaccctgcc 8226 8245 SEQ ID NO: 4601 ggcagcttctggcttgcta 12301 12320 1 4
SEQ ID NO: 3262 ccttaatgattttcaagtt 8299 8318 SEQ ID NO: 4602 aactgttgactcaggaagg 12579 12598 1 4
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3263 acataccagaattccagct
3264 aatgctgacatagggaatg
3265 atgctgacatagggaatgg
3266 aaccacctcagcaaacgaa
3267 agcaggtatcgcagcttcc
3268 tgcacaactctcaaaccct
3269 aggagtcagtgaagttctc
3270 tttttggaaatgccattga
3271 aatggagtgattgtcaaga
3272 gtcaagataaacaatcagc
3273 tccacaaattgaacatccc
3274 ttgaacatccccaaactgg
3275 acatccccaaactggactt
3276 acttctctagtcaggctga
3277 tgaatcacaaattagtttc
3278 agaaggacccctcacttcc
3279 ttggactgtccaataagat
3280 actgtccaataagatcaat
3281 ctgtccaataagatcaata
3282 gtttatgaatctggctccc
3283 atgaatctggctccctcaa
3284 ctcaacttttctaaacttg
3285 ctaaaggcatggcactgtt
3286 aaggcatggcactgtttgg
3287 atccacaaacaatgaaggg
3288 ggaatttgaaagttcgttt
3289 aataactatgcactgtttc
3290 gaaacaacgagaacattat
3291 ttcttgaaaacgacaaagc
3292 ataagaaaaacaaacacag
3293 aaaacaaacacaggcattc
3294 gcattccatcacaaatcct
3295 tttgaaaaaaacagaaaca
3296 caatgcattagattttgtc
3297 caaagctgaaaaatctcag
3298 cctggatacactgttccag
3299 gttgaagtgtctccanca
3300 tttctccatcctaggttct
3301 ttctccatcctaggttctg
3302 tcattagagctgccagtcc
3303 tgctgaactttttaaocag
3304 ctcctttcttcatcttcat
3305 tgtcattgatgcactgcag
3306 tgatgcactgcagtacaaa
3307 agctctgtctctgagcaac
3308 agccgaaattccaattttg
3309 ttgagaatgaatttcaagc
3310 aaacctactgtctcttcct
3311 tacttttccattgagtcat
3312 tcaggtccatgcaagtcag



8328 8347 SEQ ID NO:
8438 8457 SEQ ID NO:
8439 8458 SEQ ID NO:
8458 8477 SEQ ID NO:
8476 8495 SEQ ID NO:
8551 8570 SEQ ID NO:
8592 8611 SEQ ID NO:
8652 8671 SEQ ID NO:
8729 8748 SEQ ID NO:
8741 8760 SEQ ID NO:
8787 8806 SEQ ID NO:
8795 8814 SEQ ID NO:
8799 8818 SEQ ID NO:
8814 3833 SEQ ID NO:
8944 8963 SEQ ID NO:
8968 8987 SEQ ID NO:
8988 9007 SEQ ID NO:
8992 9011 SEQ ID NO:
8993 9012 SEQ ID NO:
9041 9060 SEQ ID NO:
9045 9064 SEQ ID NO:
9059 9078 SEQ ID NO:
9129 9148 SEQ ID NO:
9132 9151 SEQ ID NO:
9262 9281 SEQ ID NO:
9279 9298 SEQ ID NO:
9332 9351 SEQ ID NO:
9432 9451 SEQ ID NO:
9599 9618 SEQ ID NO:
9648 9667 SEQ ID NO:
9654 9673 SEQ ID NO:
9667 9686 SEQ ID NO:
9740 9759 SEQ ID NO:
9767 9776 SEQ ID NO:
9817 9836 SEQ ID NO:
9863 9882 SEQ ID NO:
9890 9909 SEQ ID NO:
9964 9983 SEQ ID NO:
9965 9984 SEQ ID NO:
10019 10038 SEQ ID NO:
10177 10196 SEQ ID NO:
10214 10233 SEQ ID NO:
10234 10253 SEQ ID NO:
10240 10259 SEQ ID NO:
10309 10328 SEQ ID NO:
10408 10427 SEQ ID NO:
10424 10443 SEQ ID NO:
10469 10488 SEQ ID NO:
10583 10602 SEQ ID NO:
10918 10937 SEQ ID NO:



4603 agctgccagtccttcatgt
4604 cattaatcctgccatcatt
4605 ccatttgagatcacggcat
4606 ttcgttttccattaaggtt
4607 ggaagtggccctgaatgct
4608 agggaaagagaagattgca
4609 gagaacttactatcatcct
4610 tcaatgaatttattcaaaa
4611 tcttttcagcccagccatt
4612 gctgactttaaaatctgac
4613 gggatttcctaaagctgga
4614 ccagtttccagggactcaa
4615 aagtcgattcccagcatgt
4616 tcagatggaaaaatgaagt
4617 gaaagtccataatggttca
4618 ggaagaagaggcagcttct
4619 atctaaatgcagtagccaa
4620 attgataaaaccatacagt
4621 tattgataaaaccatacag
4622 gggaatctgatgaggaaac
4623 ttgagttgcccaccatcat
4624 caagatcgcagactttgag
4625 aacagaaacaatgcattag
4626 ccaagaaaaggcacacctt
4627 ccctaacagatttgaggat
4628 aaacaaacacaggcattcc
4629 gaaatactgttttcctatt
4630 ataaactgcaagatttttc
4631 gctttccaatgaccaagaa
4632 ctgtgctttgtgagtttat
4633 gaatttgaaagttcgtttt
4634 aggaagtggccctgaatgc
4635 tgttgaaagatttatcaaa
4636 gacaagaaaaaggggattg
4637 ctgagaacttcatcatttg
4638 ctggacttctctagtcagg
4639 tgaatctggctccctcaac
4640 agaatccagatacaagaaa
4641 cagaatccagatacaagaa
4642 ggacagtgaaatattatga
4643 ctggatgtaaccaccagca
4644 atgaagcttgctccaggag
4645 ctgcgctaccagaaagaca
4646 tttgagttgcccaccatca
4647 gttgaccacaagcttagct
4648 caaagctggcaccagggct
4649 gcttcaggaagcttctcaa
4650 aggaaggccaagccagtl;t
4651 atgattatgtcaacaagta
4652 ctgacatcttaggcactga





1 0045 1 


4 




10024 1 


4 




Q2fi4 1 


4 




0*^10 1 


4 




lOQQI 1 


4 


1 0«JU 1 




4 


■ or oo 


1 3807 1 


4 




1 1 ij 1 


4 




Q9*=in 1 


4 




4838 1 


4 




11191 1 


4 




1 2622 1 


4 


0090 


9109 1 


4 


11010 


11099 1 


4 

r 


12817 


12836 1 


4 


12:^92 


12311 1 


4 


11634 


1 1 653 1 


4 


13891 


13910 1 


4 


13890 


1 3909 1 


4 


12255 


12274 1 


4 


1 1667 


11686 1 


1 4 


1 1653 


11672 1 


1 4 


9749 


9768 


1 4 


1 1077 


11096 


1 4 


7977 


7996 


\ 4 


Q655 


9674 


1 4 


12836 


12856 


1 4 


i3fina 


1 3627 


1 4 




11084 


1 4 


9690 


9709 


1 4 

1 r 


9280 


9299 


1 4 


10971 


10990 


1 4 


12933 

1 £m Val ^mJ 


12952 

1 Vp/ <v 


1 4 


10279 


10298 


1 4 


11438 


11457 


1 4 


8810 


8829 


1 4 


9046 


9065 


1 4 


6893 


6912 


1 4 


6892 


6911 


1 4 


loouo 






11186 


11205 


1 4 


13772 


13791 


1 4 


12080 


12099 


1 4 


11666 


11685 


1 4 


10547 


10566 


1 4 


13971 


13990 


1 4 


13216 


13235 


1 4 


12591 


12610 


1 4 


12363 


12382 


1 4 


5001 


5020 


1 4 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3313 atgcaagtcagcccagttc
3314 tgaatgctaacactaagaa
3315 agaagatcagatggaaaaa
3316 ggctattcattctccatcc
3317 aaagttttggctgataaat
3318 agttttggctgataaattc
3319 ctgggctgaaactaaatga
3320 cagagaaatacaaatctat
3321 gaggtaaaattccctgaag
3322 cttttttgagataaccgtg
3323 gctggaattgtcattcctt
3324 gtgtataatgccacttgga
3325 attccacatgcagctcaac
3326 tgaagaagatggcaaattt
3327 atcaaaagcccagcgttca
3328 gtgggcatggatatggatg
3329 aaatggaacttctactaca
3330 aaaaactcaccatattcaa
3331 ctgagaagaaatctgcaga
3332 acaatgctgagtgggttta
3333 caatgctgagtgggtttat
3334 ttaggcaaattgatgatat
3335 ataaactaatagatgtaat
3336 ccaactaatagaagataac
3337 ttaattatatccaagatga
3338 tttaaattgttgaaagaaa
3339 aagttcaatgaatttattc
3340 ttgaagaaaagatagtcag
3341 acttccattctgaatatat
3342 cacagaaatattcaggaat
3343 ccattgcgacgaagaaaat
3344 tataaactgcaagattttt
3345 tctgattactatgaaaaat
3346 ggagttactgaaaaagctg
3347 tgaagcttgctccaggaga
3348 tgaactggacctgcaccaa
3349 ttgctaaacttgggggagg
3350 gattcgaatatcaaattca
3351 atttgtttgtcaaagaagt
3352 tctcggttgctgccgctga
3353 gctgaggagcccgcccagc
3354 ctggtctgtccaaaagatg
3355 ctgagagttccagtggagt
3356 cagtgcaccctgaaagagg
3357 ctctgaggagtttgctgca
3358 acatcaagaggggcatcat
3359 ctgatcagcagcagccagt
3360 ggacgctaagaggaagcat
3361 agctgttttgaagactctc
3362 tgaaaaaactaaccatctc



10926 10945 SEQ ID NO:
10983 11002 SEQ ID NO:
11004 11023 SEQ ID NO:
11264 11283 SEQ ID NO:
11288 11307 SEQ ID NO:
11290 11309 SEQ ID NO:
11316 11335 SEQ ID NO:
11413 11432 SEQ ID NO:
11480 11499 SEQ ID NO:
11545 11564 SEQ ID NO:
11735 11754 SEQ ID NO:
11795 11814 SEQ ID NO:
11859 11878 SEQ ID NO:
11992 12011 SEQ ID NO:
12050 12069 SEQ ID NO:
12143 12162 SEQ ID NO:
12179 12198 SEQ ID NO:
12219 12238 SEQ ID NO:
12428 12447 SEQ ID NO:
12447 12466 SEQ ID NO:
12448 12467 SEQ ID NO:
12477 12496 SEQ ID NO:
12897 12916 SEQ ID NO:
13039 13058 SEQ ID NO:
13095 13114 SEQ ID NO:
13151 13170 SEQ ID NO:
13190 13209 SEQ ID NO:
13326 13345 SEQ ID NO:
13377 13396 SEQ ID NO:
13451 13470 SEQ ID NO:
13660 13579 SEQ ID NO:
13607 13626 SEQ ID NO:
13637 13656 SEQ ID NO:
13726 13745 SEQ ID NO:
13773 13792 SEQ ID NO:
13955 13974 SEQ ID NO:
14058 14077 SEQ ID NO:
4412 4431 SEQ ID NO:
4661 4570 SEQ ID NO:
33 52 SEQ ID NO:
47 66 SEQ ID NO:
227 246 SEQ ID NO:
291 310 SEQ ID NO:
404 423 SEQ ID NO:
472 491 SEQ ID NO:
582 601 SEQ ID NO:
830 849 SEQ ID NO:
865 884 SEQ ID NO:
1087 1106 SEQ ID NO:
1113 1132 SEQ ID NO:



4653 gaactcagaaggatggcat
4654 ttctcaattttgattttca
4655 ttttctaaatggaacttct
4656 ggatctaaatgcagtagcc
4657 atttcttaaacattccttt
4658 gaatctggctccctcaact
4659 tcattctgggtctttccag
4660 atagcatggacttcttctg
4662 cttctggcttgctaacctc
4663 cacggagttactgaaaaag
4664 aaggcatctccacctcagc
4665 tccaagatgagatcaacac
4666 gttgagaagccccaagaat
4667 aaattctcttttcttttca
4668 tgaaagtcaagcatctgat
4669 catccttaacaccttccac
4670 tgtaccataagccatattt
4671 ttgatgttagagtgctttt
4672 tctgcacagaaatattcag
4673 taaatggagtctttattgt
4674 ataaatggagtctttattg
4675 atattgtcagtgcctctaa
4676 attactatgaaaaatttat
4677 gttattttgctaaacttgg
4678 tcatcctctaattttttaa
4679 tttcatttgaaagaataaa
4680 gaataccaatgctgaactt
4681 ctgagagaagtgtcttcaa
4682 atatctggaaccttgaagt
4683 attccctgaagttgatgtg
4684 atttttattcctgccatgg
4685 aaaattcaaactgcctata
4686 atttgtaagaaaatacaga
4687 cagcatgcctagtttctcc
4688 tctcctttcttcatcttca
4689 ttggtagagcaagggttca
4690 cctcctacagtggtggcaa
4691 tgaaaacgacaaagcaatc
4692 acttttctaaacttgaaat
4693 tcagcccagccatttgaga
4694 gctggatgtaaccaccagc
4695 catcagaaccattgaccag
4696 actcaatggtgaaattcag
4697 cctcacttcctttggactg
4698 tgcaaacttgacttcagag
4699 atgacgttcttgagcatgt
4700 actggacttctctagtcag
4701 atgcctacgttccatgtcc
4702 gagaagtgtcttcaaagct
4703 gagatcaacacaatcttca



14002 14021 1 4
8526 8545 1 4
12173 12192 1 4
11632 11651 1 4
9489 9508 1 4
9047 9066 1 4
11035 11054 1 4
8873 8892 1 4
12306 12325 1 4
13723 13742 1 4
12102 12121 1 4
13104 13123 1 4
6264 6273 1 4
9220 9239 1 4
12669 12688 1 4
8071 8090 1 4
10088 10107 1 4
6993 7012 1 4
13447 13466 1 4
14086 14105 1 4
14085 14104 1 4
13392 13411 1 4
13641 13660 1 4
14052 14071 1 4
13800 13819 1 4
7032 7051 1 4
10168 10187 1 4
12407 12426 1 4
10737 10756 1 4
11488 11607 1 4
10103 10122 1 4
13873 13892 1 4
6436 6455 1 4
9952 9971 1 4
10213 10232 1 4
7866 7876 1 4
4230 4249 1 4
9603 9622 3 3
9063 9082 3 3
9236 9255 2 3
11185 11204 2 3
8134 8153 2 3
7465 7484 2 3
8977 8996 2 3
11399 11418 2 3
7050 7069 2 3
8809 8828 2 3
11354 11373 2 3
12411 12430 2 3
13112 13131 2 3
SEQ ID NO 


oooo 


ctoSQctciaoannnntpsn 


1 17fi 


1 i»03EQ 


ID NO 


; 4 / \jh cig aaiiacLgcaccicag 


3035 


3054 


2 


3 


opn in MO 

OCW IL^ INvJ 






1 '^l "1 


io^U3EQ 


ID NO 


4 / uo iiggiagagcaagggttca 


7856 


7875 


2 


3 


ciFO In MO 


oooo 


cctta t atcf CO Rf CI a rcs 


1440 




ID NO 


; H-r uoiggcacigiiiggagaagg 


9138 


9157 


2 


3 


CiFO in MO 


OODD 


aaaaactartnaarattnp 


1^00 

1 <_JUU 


10 1 WSEQ 


ID NO 


; ti-zu / gcaagicagcccagncci 


10928 


10947 


2 


3 


9Fn in MO 


000 / 


atttaattctacaaatcat 


1575 




ID NO 


; 4/ uodigsiaacuaaigacaaaat 


7428 


7447 


2 


3 


cpo in MO 

OL.W It-J iMW 


ooOo 


tccaa a actcaaatcttca 




n fi4K ^ r** j*^ 

idDsEQ 


ID NO 


• 4/uc7igaaoiacaaigcicigga 


5520 


5539 


2 


3 


ftPO in MO 




oottcttcttcaa actttc 


1744 


1 f DosEQ 


ID NO 


• ^1 lugaaaiaccaagicaaaacc 


10455 


1 0474 


2 


3 


cpn in MO 




attcf ata 30 osaf ncttca 


1810 


lOdSySEQ 


ID NO 


• -H-/ 1 1 igaaaaagctgcaaiCdcic 


13734 


A Oi jT" 


2 


3 


cpn in MO 


00 f 1 


icc33ci atcto s ?^ r=s r=ii n tt 


1 941 

1 1 


1 i^ausEQ 


ID NO 


; "H-f 1 .^aacigciiciccaaaigya 


3552 


3571 


2 


3 


<^po in MO 


00 r js: 


3Qttaal'oaaaaaaattrt 


1956 




ID NO 


2 "4-/^ 1 ocsyaaucaiaaicccaacT 


6275 


8294 


2 


3 


cpn in MO 


Oo r 0 


oaaa a □aatcttatatita 


2084 


1 »-'oSEQ 


ID NO 


; 4f i^caoaacciaciyicicnc 


10467 


10486 


2 


3 


9FO in MO 




Qaaaoctctttttaaaaaa 

53 57 *^ Sy ^ ^* ^^y y y « « y 


2221 


224.norrr-\ 
^^HUQEQ 


ID NO 


<n 7n K ^"^TT^^^^^^^tf^ ^^^^^ 

1 H-r 1 uLallCaCaiaC/CayaanCC 


oo24 


8343 


2 


3 


<^FO in Mn 




f OQaataatactca atott 


2374 




ID NO 


; T- f 1 Dc!aLFcaacaL.aOaygcailCCa 


C\ fit FZ G 


9675 


2 


3 


^po in MO 


00 / 0 


QatttaaaatnnaRRnpi=?io 




SEQ 


ID NO 


>^7n 7 /^4"t^Ql*rt ^^/^/^I'-f^ i^'^^^^'^ 

; 1 / oiicaigiccciagaaatc 


10037 


10056 


2 


3 


SFO in MO 

UL.Vj< IL^ I\1w 


. 00 t f 


tccasaasaatcccaaaaa 


2417 


2436 oc/^ 


ID NO 


• HI 1 oL>uuaguuigciiici.gga 


4aoi 


49 /^U 


2 


3 


SFO in wn 


00 fO 


acioaaaaactcaaaaaatG 


2570 


2589oc:/^ 


ID NO 


- H-/ 1 czCraiiagaguiyccoyicci 


1UU20 


10039 


2 


3 


^po in MO 


'^'^7Q 

•JO / y 


aa a ata acttttttcttca 


2583 


9Rn9oi-r\ 

^ou-^SEQ 


ID NO 


1 ^/ ^uiy crifagatgacgaciiTici 


12160 


I2l79 


2 


3 


SFO in MO 


OOOU 


tttatQacaaatataaaca 


2765 


^'^^SEQ 


ID NO 


1 'H' / ^ 1 igccag Liigaaaaacaaa 


1 I0IO 


1 1834 


2 


3 


SFO in NO< 


000 1 


ctciaQQctaccataacatt 


3252 


3271 ocr\ 


ID NO 


tr ^^dciiyLLfCiyuLoiigiiuay 


luyoo 


10922 


2 


3 


9Fn in MO- 




Qtaaataccaaaaaaataa 


3668 


oaoA SEQ 


ID NO 


H-f ^oicaiugcccicaacciac 


1 1450 


11469 


2 


3 


SFO in NO' 


OOOO 


aaata acttccaatttccc 


3681 


O^ UUSEQ 


ID NO 


*t/^*tgggaacigiigaaagaui 


1 <iy27 


12946 


2 


0 

3 


SFO in MO 




ataacttccaatttcccta 


3683 


O/U-tlSEQ 


ID NO 


Hf /lOcaggagaaciiaciaicaT 


1 o7oO 


13804 


2 


3 


opn in MO 


OOOO 


atctaccatctcaaaaatt 


4104 

1 1 w 1 


^ '^*^SEQ 


ID NO: 


H-^^oaacicciccacigaaagai 


yo47 


9566 


2 


3 


Spn in NO- 

IL/ IMw, 


OOOQ 


atttatttatcaaaa aaat 


4551 


4570 ocr\ 
*tof ug^Q 


ID NO: 


HI ^1 auiiccgLnaccagaaai 


00>I7 


8266 


2 


3 


qpn in NO- 


000 / 


a caa aa cttaacctctcta 


5135 




ID NO: 


*#/ iiocagagciiicigccacigc 


13o1o 


^ 0 C 0 "7 

13537 


2 


3 


fiFO in MO 

OC-W IL' l^KJ 


OOOO 


atato eta aaataaaattt 


5353 


oof i^SEQ 


1 r*\ Ik 1 . 

ID NO: 


*t / ^yaaaiicaaacigccraiat 


1 oo74 


1 o89o 


2 


3 


SFO in MO- 


oooy 


tea a a a cttaaca a cattt 


5420 




ID NO: 


Ha ouddaLaCUCCacaaaLlya 


o7o0 


8799 


2 


3 


opn in MO' 

wl^W lU INwi 


009U 


caataacctaaaatacaat 


5512 

WW 1 ^ 


000 1 SEQ 


ID NO: 


HI o \ aiigaacaiccccaaactg 


8794 


8813 


2 


3 


QFO in MO' 

IL/ INV^. 


OOO 1 


ta ca aat Q a caata a a aa a 


5848 


^00 ' SEQ 


ID NO: 


4/ o^iucaacigccnigigia 


1 1229 


1 1248 


2 


3 


SFO in NO* 

IL/ lAlW, 


OOi?^ 


cttttotaaaatataataa 


6285 


63n4oc/-\ 
aouH-sEQ 


ID NO: 


*!■ 00 1 laiig cig aa icca a a a g 


1 ODOb 


1od7o 


2 


3 


SFO in MO- 


OOc^O 


ttataaaatataataaaaa 


6288 


xjouf SEQ 


ID NO: 


H-/ OH-Liiicaaguaaaigcacaa 


oDoy 


0 cc 0 

o5oo 


2 


3 


SFO in NO* 


0053 


tccatta acctcccatttt 


6320 




ID NO: 


H-r ou dadagaaaaniigcigga 


1 U f 00 


1U/70 


2 


3 


SFO in NO- 


OOS73 


aattatctaaattcattca 


6488 


6507oc/^ 


ID NO: 


HI oolijaayiagaccaacaaaic 


7*1 ftO 


7'1 0 A 


2 


0 
0 


SFO in NO- 

VJt^W IL/ IN\^. 


OO9O 


aattaaaaaaaacaaattt 


6506 


^'-'^'^bEQ 


ID NO: 


tr 0/ ctadCldadiydlClaaaU 


1 1 oZ4 


1 \ o4o 


2 


3 


SFO ID NO* 




atttgaaaatagctattgc 


6696 


6715QP=r^ 


ID NO. 


•+/ 00 yoctctiiLoiyodC/ayaaai 


1 o44 1 


1 o4dU 


2 


0 


SFO ID NO' 




tqagcatqtcaaacacttt 


7060 

f 


7079Qcr\ 


ID NQ: 


*-r / 023 aaa^l.>L>dlLOeiy LLfLUlUd 


1 007-1 


A onon 


2. 


0 
0 


SEO ID NO' 


OOC7E7 


ttgaagatqttaacaaatt 


7356 


7375 ocr^ 


ir> M/^i 

ID Nu: 


^ f dd llU^dLd ly dddy iCaa 


•1 ooon 
1 ^DDU 


H 0^37Q 


2 


0 


SFQ ID NO' 




acttgtcacctacatttct 


7753 


7772 Q CM 


ID NO: 


^ 1 dyddLdllLiydLUUddy I 


1 04/0 


1 o^^o 


2 


0 
0 


SFO in MO 

IL/ INw, 


OS-U 1 


Qttttccacaccaaaattt 


8050 




ID NO: 


H- f "H-^ dddiuiggaiiicnaaac 


y4oi 


9500 


2 


3 


SFO ID NO' 


Ot-U^ 


ataa dtacaaccaaaattt 


9405 


9424 oc/^ 


ID NO: 


H-/ H-o dddiaaaiggagicuiai 


1 4UC)0 


AAA /^O 

14102 


2 


3 


SFO ID NO' 


0*frUO 


CQQQacctacaaaactaaa 


p 

w 


^' oEQ 


ID NO: 


4 / ^4 cicayiiaacigiy icccy 


1 1o71 


•1 H con 
1 1590 




3 


SEQ ID NO: 


3404 


agtgcccttctcggttgct 


25 


44 SEQ 


ID NO: 


4745 agcatctgattgactcact 


12678 


12697 


I 


3 


SEQ ID NO: 


3405 


f 

gctgaggagcccgcccagc 


47 


66SEQ 


ID NO: 


4746gctgattgaggtgtccagc 


1225 


1244 




3 


SEQ ID NO: 


3406 


gaggagcccgcccagccag 


50 


69SEQ 


ID NO: 


4747 ctggatcacagagtccctc 


3752 


3771 




3 


SEQ ID NO: 


3407 


gggccgcgaggccgaggcc 


72 


91 SEQ 


ID NO: 


4748 ggccctg atccccgagccc 


1363 


1382 




3 


SEQ ID NO: 


3408 


ccaggccgcagcccaggag 


89 


108 SEQ 


ID NO: 


4749ctcccggagccaaggctgg 


2682 


2701 




3 


SEQ ID NO: 


3409 


ggagccgccccacx^cagc 


104 


123SEQ 


ID NO:


4750 gctgttttgaagactctcc 


1088 


1107 




3 


SEQ ID NO: 


3410 


gaagaggaaatgctggaaa 


200 


219SEQ 


ID NO:


4751 tttcaagttcctgaccttc 


8309 


8328 




3 


SEQ ID NO: 


3411 


caaaagatgcgacccgatt 


237 


256SEQ 


ID NO: 


4752 aatcttattggggattttg 


7085 


7104 




3 


SEQ ID NO: 


3412 


attcaagcacctccggaag 


253 


272SEQ 


ID NO: 


4753 cttccacatttcaaggaat


10067 


10086 




3


SEQ ID NO


3413 


gttccagtggagtccctgg 


297 


316SEQ 


ID NO 


4754ccagcaagtacctgagaac 


8610 


8629 


1 3 


SEQ ID NO 


: 3414 


gactgctgattcaagaagt


316 


335 SEQ


ID NO 


: 4755acttgaagaaaagatagtc 


13324 


13343 


1 3 


SEQ ID NO 


: 3415 


gtgccaccaggatcaactg 


333 


352SEQ 


ID NO 


; 4756cagtgaagctgcagggcac 


10704 


10723 


1 3 


SEQ ID NO 


: 3416 


gatcaactgcaaggttgag 


343 


362SEQ 


ID NO 


; 4757ctcacctccacctctgatc 


4748 


4767 


1 3 


SEQ ID NO 


: 3417 


actgcaaggttgagctgga 


348 


367SEQ 


ID NO 


: 4758tccactcacatcctccagt 


1289 


1308 


1 3 


SEQ ID NO 


: 3418 


ccagctctgcagcttcatc 


373 


392 SEQ 


ID NO 


: 4759gatgtggtcacctacctgg 


1343 


1362 


1 3 


SEQ ID NO 


: 3419 


agcttcatcctgaagacca 


383 


402SEQ 


ID NO 


: 4760tggtgctggagaatgagct 


5112 


5131 


1 3 


SEQ ID NO 


: 3420 


cttcatcctgaagaccagc 


385


404SEQ 


ID NO 


: 4761 gctggagtaaaactggaag 


2696 


2715 


1 3 


SEQ ID NO 


: 3421 


ccagccagtgcaccctgaa 


399 


418SEQ 


ID NO 


: 4762ttcaagatgactgcactgg 


1539 


1558 


1 3 


SEQ ID NO 


: 3422 


cagtgcaccctgaaagagg 


404 


423 SEQ 


ID NO 


: 4763cctcacagagctatcactg 


5230 


5249 


1 3 


SEQ ID NO 


: 3423 


tggcttcaaccctgagggc


427 


446 SEQ 


ID NO 


: 4764 gcccactggtcgcctgcca


3533 


3552 


1 3 


SEQ ID NO 


: 3424 


cttcaaccctgagggcaaa 


430 


449SEQ 


ID NO 


: 4765 tttgagccaacattggaag


2207 


2226 


1 3 


SEQ ID NO 


: 3425 


ttcaaccctgagggcaaag 


431 


450SEQ 


ID NO 


: 4766ctttgacaggcattttgaa 


9727 


9746 


1 3 


SEQ ID NO 


: 3426 


cttgctgaagaaaaccaag 


451 


470SEQ 


ID NO 


: 4767cttgaaattcaatcacaag 


9074 


9093 


1 3 


SEQ ID NO 


: 3427 


tgctgaagaaaaccaagaa 


453 


472SEQ 


ID NO 


; 4768ttctgctgccttatcagca 


5647 


5666 


1 3 


SEQ ID NO 


: 3428 


ttgctgcagccatgtccag 


483 


502SEQ 


ID NO 


; 4769ctggtcagtttgcaagcaa 


3004 


3023 


1 3 


SEQ ID NO 


: 3429 


tgctgcagccatgtccagg 


484 


503SEQ 


ID NO 


: 4770 cctggtcagtttgcaagca


3003 


3022 


1 3 


SEQ ID NO 


3430 


agccatgtccaggtatgag 


490 


509SEQ 


ID NO 


4771 ctcacatcctccagtggct


1293 


1312 


1 3 


SEQ ID NO 


3431 


agctcaagctggccattcc 


507 


526 SEQ 


ID NO 


4772 ggaactaccacaaaaagct 


7489 


7508 ' 


1 3 


SEQ ID NO 


3432 


agaagggaagcaggttttc 


526 


545SEQ 


ID NO 


4773 gaaatcttcaatttattct


13821 


13840 ' 


1 3 


SEQ ID NO 


3433 


aagggaagcaggttttcct 


528 


547SEQ 


ID NO 


4774aggacaccaaaataacctt 


7572 


7591 • 


1 3 


SEQ ID NO 


3434 


agaaagatgaacctactta 


555 


574SEQ 


ID NO 


4775taagaactttgccacttct 


4852 


4871 ' 


1 3 


SEQ ID NO. 


3435 


atcctgaacatcaagaggg


575 


594SEQ 


ID NO 


4776 ccctaacagatttgaggat


7977 


7996 ' 


1 3 


SEQ ID NO: 


3436 


tcctgaacatcaagagggg 


576 


595SEQ 


ID NO. 


4777cccctaacagatttgagga 


7976 


7995 ' 


1 3 


SEQ ID NO: 


3437 


ctgaacatcaagaggggca 


578 


597SEQ 


ID NO; 


4778 tgcctgcctttgaagtcag


7908 


7927 ' 


1 3 


SEQ ID NO: 


3438 


aacatcaagaggggcatca 


581


600SEQ 


ID NO: 


4779tgataaaaaccaagatgtt 


6298 


6317 ' 


1 3 


SEQ ID NO: 


3439 


acatcaagaggggcatcat 


582 


601 SEQ 


ID NO: 


4780 atgataaaaaccaagatgt


6297 


6316 ' 


1 3 


SEQ ID NO: 


3440 


tcatttctgccctcctggt 


597 


616SEQ 


ID NO: 


4781 accaccagtttgtagatga 


7413 




1 3 


SEQ ID NO: 


3441 


ttcccccagagacagaaga 


615 


634SEQ 


ID NO: 


4782 tcttccacatttcaaggaa 


10066 


10085 1


1 3 


SEQ ID NO: 


3442 


gaagaagccaagcaagtgt 


629 


648SEQ 


ID NO: 


4783 acaccttccacattccttc 


8079 


8098 1 


1 3 


SEQ ID NO: 


3443 


ttgtttctggataccgtgt 


647 


666SEQ 


ID NO: 


4784 acactaaatacttccacaa 


8776 


8794 1 


1 3 


SEQ ID NO: 


3444 


tgtatggaaactgctccac 


663 


682SEQ 


ID NO: 


4785gtggaggcaacacattaca 


2928 


2947 1 


3 


SEQ ID NO: 


3445 


aaactgctccactcacttt 


670 


689SEQ 


ID NO: 


4786aaagaaacagcatttgttt 


4540 


4559 1 


3 


SEQ ID NO: 


3446 


actcactttaccgtcaaga 


680 


699SEQ 


ID NO: 


4787tcttacttttccattgagt 


10580 


10599 1 


1 3 


SEQ ID NO: 


3447 


ctttaccgtcaagacgagg 


685 


704SEQ 


ID NO: 


4788 cctccagctcctgggaaag


2491 


2510 1


3 


SEQ ID NO: 


3448 


ttaccgtcaagacgaggaa 


687 


706SEQ 


ID NO: 


4789ttcctaaagctggatgtaa 


11177 


11196 1 


3 


SEQ ID NO: 


3449 


acgaggaagggcaatgtgg 


698 


717SEQ 


ID NO: 


4790ccacaagtcatcatctcgt 


5964 


5983 1 


3 


SEQ ID NO: 


3450 


cgaggaagggcaatgtggc 


699 


718SEQ 


ID NO: 


4791 gccagaagtgagatcctcg 


3515


3534 1 


3 


SEQ ID NO: 


3451 


gaggaagggcaatgtggca 


700 


719SEQ 


ID NO: 


4792tgccagtctccatgacctc 


2476 


2495 1 


3 


SEQ ID NO: 


3452 


ggaagggcaatgtggcaac 


702 


721 SEQ 


ID NO: 


4793gttgctcttaaggacttcc 


13364 


13383 1 


3 


SEQ ID NO: 


3453 


gaagggcaatgtggcaaca 


703 


722SEQ 


ID NO: 


4794tgttgatgaggagtccttc 


1809 


1828 1 


3 


SEQ ID NO: 


3454 


caggcatcagcccacttgc 


777 


796SEQ 


ID NO: 


4795gcaagtctttcctggcctg 


3019 


3038 1 


3 


SEQ ID NO: 


3455 


aggcatcagcccacttgct 


778 


797SEQ 


ID NO: 


4796 agcaagtctttcctggcct 


3018 


3037 1 


3 


SEQ ID NO: 


3456 


tcagcccacttgctctcat 


783 


802SEQ 


ID NO: 


4797atgaaagtcaagcatctga 


12668 


12687 1 


3 


SEQ ID NO: 


3457 


gtcaactctgatcagcagc 


823 


842SEQ 


ID NO: 


4798gctgactttaaaatctgac 


4819 


4838 1 


3 


SEQ ID NO: 


3458 


ggacgctaagaggaagcat 


865 


884SEQ 


ID NO: 


4799 atg cactgtttctgagtcc 


9339 


9358 1 


3 


SEQ ID NO: 


3459 


aaggagcaacacctcttcc


902 


921 SEQ 


ID NO: 


4800 ggaatatcttagcatcctt


13465 


13484 1 


3 


SEQ ID NO: 


3460 


aggagcaacacctcttcct 


903 


922SEQ 


ID NO: 


4801 aggaatatcttagcatcct 


13464 


13483 1 


3 


SEQ ID NO: 


3461 


caacacctcttcctgcctt 


908 


927SEQ 


ID NO: 


4802aaggctgactctgtggttg 


4292 


4311 1 


3 


SEQ ID NO: 


3462 


aacacctcttcctgccttt


909 


928SEQ 


ID NO: 


4803aaagcaggccgaagctgtt 


1075 


1094 1


3


SEQ ID NO:


3463 


acaagaataagtatgggat 


933 


952SEQ ID NO: 


4804 atccatgatctacatttgt 


6794 


6813 1 


3 


SEQ ID NO: 


3464 


caagaataagtatgggatg 


934 


953SEQ ID NO: 


4805 catcactttacaagccttg 


1246 


1265 1 


3 


SEQ ID NO: 


3465 


tagcacaagtgacacagac 


954 


973SEQ ID NO: 


4806 gtctcttcgttctatgcta 


4592 


4611 1 


3 


SEQ ID NO: 


3466 


agcacaagtgacacagact 


955 


974SEQ ID NO: 


4807 agtctcttcgttctatgct 


4591 


4610 1 


3 


SEQ ID NO: 


3467 


gcacaagtgacacagactt 


956 


975 SEQ ID NO: 


4808 aagtgtagtctcctggtgc 


5099 


5118 1 


3 


SEQ ID NO: 


3468 


aacttgaagacacaccaaa 


978 


997SEQ ID NO: 


4809 tttgaggattccatcagtt


7987 


8006 1 


3 


SEQ ID NO: 


3469 


gcttctttggtgaaggtac 


1008


1027 SEQ ID NO:


4810 gtacctacttttggcaagc


8372 


8391 1 


3 


SEQ ID NO: 


3470 


ctttggtgaaggtactaag 


1012 


1031 SEQ ID NO: 


4811 cttatgggatttcctaaag


11167 


11186 1 


3 


SEQ ID NO: 


3471 


tactaagaagatgggcctc 


1024 


1043 SEQ ID NO:


4812 gagggtagtcataacagta


10337 


10356 1 


3 


SEQ ID NO: 


3472 


tttgagagcaccaaatcca 


1046 


1065 SEQ ID NO: 


4813 tggaagtgtcagtggcaaa


10380 


10399 1


3 


SEQ ID NO: 


3473 


agagcaccaaatccacatc


1050 


1069 SEQ ID NO: 


4814 gatggatatgaccttctct


4876 


4895 1 


3 


SEQ ID NO: 


3474 


agctgttttgaagactctc 


1087 


1106SEQ ID NO: 


4815 gagaacatactgggcagct


5880 


5899 1 


3 


SEQ ID NO: 


3475 


tgaaaaaactaaccatctc 


1113 


1132SEQ ID NO: 


4816 gagaaaatcaatgccttca


7112 


7131 1 


3 


SEQ ID NO:


3476 


gaaaaaactaaccatctct


1114 


1133SEQ ID NO: 


4817 agagccaggtcgagctttc


11052 


11071 1 


3 


SEQ ID NO: 


3477 


tctgagcaaaatatccaga 


1130 


1149SEQ ID NO: 


4818tctgatgaggaaactcaga 


12260 


12279 1 


3 


SEQ ID NO: 


3478 


tctcttcaataagctggtt 


1156 


1175SEQ ID NO: 


4819 aacctcccattttttgaga


6326 


6345 1 


3 


SEQ ID NO: 


3479 


ctgagctgagaggcctcag 


1176 


1195SEQ ID NO: 


4820 ctgatccccgagccctcag 


1367 


1386 1 


3 


SEQ ID NO: 


3480 


tgaagcagtcacatctctc 


1198 


1217SEQ ID NO: 


4821 gagaaaatcaatgccttca 


7112 


7131 1 


3 


SEQ ID NO: 


3481 


aagcagtcacatctctctt 


1200 


1219SEQ ID NO: 


4822 aagaggcagcttctggctt 


12297 


12316 1 


3 


SEQ ID NO: 


3482 


ctctcttgccacagctgat 


1212


1231 SEQ ID NO:


4823 atcaaaagaagcccaagag 


12946 


12965 1 


3 


SEQ ID NO: 


3483 


tcttgccacagctgattga 


1215 


1234SEQ ID NO: 


4824 tcaaagttaattgggaaga


12279 


12298 1 


3 


SEQ ID NO: 


3484 


cttgccacagctgattgag


1216 


1235SEQ ID NO: 


4825ctcaattttgattttcaag 


8528 


8547 1 


1 3 


SEQ ID NO: 


3485 


tgaggtgtccagccccatc 


1231 


1250SEQ ID NO: 


4826 gatggaaccctctccctca


4733 


4752 1 


1 3 


SEQ ID NO: 


3486 


tcagtgtggacagcctcag 


1267 


1286SEQ ID NO: 


4827 ctgacatcttaggcactga 


5001 


5020 1 


1 3 


SEQ ID NO: 


3487 


acatcctccagtggctgaa 


1296 


1315SEQ ID NO: 


4828ttcagaagctaagcaatgt 


7239


7258 ' 


1 3 


SEQ ID NO:


3488 


gcacagcagctgcgagaga 


1385 


1404SEQ ID NO: 


4829tctctgaaagacaacgtgc 


12323 


12342 ' 


1 3 


SEQ ID NO: 


3489 


cagcagctgcgagagatct 


1388 


1407SEQ ID NO: 


4830agataacattaaacagctg 


13051 


13070 ' 


1 3 


SEQ ID NO: 


3490 


gcgagggatcagcgcagcc 


1415 


1434SEQ ID NO: 


4831 ggctcaacacagacatcgc 


5718 


5737 ' 


1 3


SEQ ID NO: 


3491 


aagacaaaccctacaggga 


1478 


1497SEQ ID NO: 


4832tcccagaaaacctcttctt 


3936 


3955 ' 


1 3


SEQ ID NO: 


3492 


caggagctgctggacattg 


1499 


1518 SEQ ID NO:


4833caatggagagtccaacctg 


4660 


4679 


1 3 


SEQ ID NO: 


3493 


aggagctgctggacattgc 


1500 


1519SEQ ID NO: 


4834gcaagggttcactgttcct 


7864 


7883 


1 3 


SEQ ID NO: 


3494 


ctgctggacattgctaatt


1505 


1524SEQ ID NO: 


4835aattgggaagaagaggcag 


12287 


12306 ' 


1 3 


SEQ ID NO: 


3495 


gattacacctatttgattc 


1565 


1584SEQ ID NO: 


4836 gaatattttgagaggaatc 


6353 


6372 


1 3 


SEQ ID NO: 


3496 


atttgattctgcgggtcat 


1575 


1594SEQ ID NO: 


4837atgaagtagaccaacaaat 


7161 


7180 


1 3 


SEQ ID NO: 


3497 


tctgcgggtcattggaaat 


1582 


1601 SEQ ID NO: 


4838atttgtaagaaaatacaga 


6436 


6455 


1 3 


SEQ ID NO: 


3498 


aaccatggagcagttaact 


1609 


1628SEQ ID NO: 


4839 agtttctccatcctaggtt 


9962 


9981 


1 3 


SEQ ID NO: 


3499 


ggagcagttaactccagaa 


1615 


1634 SEQ ID NO:


4840 ttctgaaaatccaatctcc 


8400 


8419 


1 3 


SEQ ID NO: 


3500 


actccagaactcaagtctt 


1625 


1644SEQ ID NO: 


4841 aagatcgcagactttgagt 


11654 


11673 


1 3 


SEQ ID NO: 


3501 


tccagaactcaagtcttca 


1627 


1646SEQ ID NO: 


4842tgaactcagaagaattgga 


1920 


1939 


1 3 


SEQ ID NO: 


3502 


aagtacaaagccatcactg 


1663 


1682SEQ ID NO: 


4843 cagtcatgtagaaaaactt 


4429 


4448 


1 3 


SEQ ID NO: 


3503 


gccatcactgatgatccag 


1672 


1691 SEQ ID NO: 


4844ctggaactctctccatggc 


10883 


10902 


1 3 


SEQ ID NO: 


3504 


ccatcactgatgatccaga


1673 


1692 SEQ ID NO:


4845tctgaactcagaaggatgg 


13999 


14018 


1 3 


SEQ ID NO: 


3505 


atccagaaagctgccatcc 


1685 


1704SEQ ID NO: 


4846 ggatttcctaaagctggat 


11173 


11192 


1 3 


SEQ ID NO: 


3506 


cagaaagctgccatccagg 


1688 


1707SEQ ID NO: 


4847 cctgaaatacaatgctctg 


5518 


6537 


1 3 


SEQ ID NO: 


3507 


acaaggaccaggaggttct 


1731 


1750SEQ ID NO: 


4848 agaaacagcatttgtttgt 


4542 


4561


1 3 


SEQ ID NO: 


3508 


aggaccaggaggttcttct 


1734 


1753 SEQ ID NO: 


4849 agaagctaagcaatgtcct 


7242 


7261 


1 3 


SEQ ID NO: 


3509 


accaggaggttcttcttca 


1737 


1756 SEQ ID NO:


4850tgaaggctgactctgtggt 


4290 


4309 


1 3 


SEQ ID NO:


3510 


tcttcagactttccttgat 


1750 


1769SEQ ID NO: 


4851 atcaggaagggctcaaaga 


2567 


2586 


1 3 


SEQ ID NO: 


3511 


ttcagactttccttgatga


1752 


1771 SEQ ID NO: 


4852tcattactcctgggctgaa 


11307 


11326 


1 3 


SEQ ID NO: 


3512 


gttgatgaggagtccttca 


1810 


1829 SEQ ID NO:


4853 tgaatctggctccctcaac


9046 


9065 


1 3


SEQ ID NO


3513






1843 SEQ ID NO


: 4854ttaatcgagaggtatgaag 


7148 


7167 


1 


3 


SEQ ID NO


3514 


iLUdcaggcagaiaiiaac 


1825


1844SEQ ID NO 


4855 gttaatcgagaggtatgaa 


7147 


7166 


1 


3 


SEQ ID NO


: 3515 


gyL/agaxanaacaaaan 


1831


1850SEQ ID NO 


: 4856aattgcattagatgatgcc 


6589 


6608 


1 


3 


SEQ ID NO


: 3516 



aiaiiaacaaaatigicca 


1836


1855SEQ ID NO 


: 4857tggagtttgtgacaaatat 


2760 


2779 


1 


3 


SEQ ID NO


3517 


dUdaaaugiccaaaiicT 


1842



1861 SEQ ID NO 


: 4858agaaacagcatttgtttgt 


4542 


4561 


1 


3 


SEQ ID NO


: 3518 


y ety ca agiy a ag aacnig 


1877


1896SEQ ID NO 


: 4859caaatgacatgatgggctc 


5334 


5353 


1 


3 


SEQ ID NO


: 3519 


y i-y aoy aacuiyiyycii 


1883


1902 SEQ ID NO


: 4860aagcatctgattgactcac 


12677 


12696 


1 


3 


SEQ ID NO


: 3520 



caycsciu.LiigiygcxxccGa 


1887



1906SEQ ID NO 


: 4861 tgggcctgccccagattct


8909 


8928 


1 


3 


SEQ ID NO


: 3521


LiLL) ly y L-ucccaiatig 


1892


1911 SEQ ID NO


: 4862caataagatcaatagcaaa 


8998 


9017 


1 


3 


SEQ ID NO


: 3522 



lyycucccaiaiigccaa 


1896


1915SEQIDNO 


: 4863 ttggctcacatgaaggcca


7631 


7650 


1 


3 


SEQ ID NO


: 3523 


I louca laiiy ccaaiaic 


1900



1919SEQ ID NO 


; 4864gatatacactagggaggaa 


12745 


12764 


1 


3 


SEQ ID NO


3524 


lULfUra Let iiy ccaaiaici 


1901



1920SEQ ID NO 


: 4865agatcaaagttaattggga 


12276 


12295 


1 


3 


SEQ ID NO


3525


itfl rr*?* isjfaf oHti a car«f 


1908


1927 SEQ ID NO


I 4866 gagtcccagtgccx;agcaa 


9352 


9371 


1 


3 


SEQ ID NO


3526


ttflfl of Sit i^r* Q dnati^infa 

y d Id Luuaeiyai.i.i>iya 


1934


1953SEQ ID NO 


: 4867tcagtataagtacaaccaa 


9400 


9419 


1 


3 


SEQ ID NO




if^Oaiflin at ^fn a a Q a a 

luirfOieiy a luiy ddaaay 11 


1941


1960SEQ ID NO 


; 4868 aacttccaactgtcatgga 


1986 


2005 


1 


3 


SEQ ID NO 


: 3528 


cigaaaaaguagtgaaag 


1949 


1968 SEQ ID NO


: 4869ctttgaagtcagtcttcag 


7915 


7934 


1 


3 


SEQ ID NO




c«y iidyigaaayaagxiCT 


1956


1975SEQ ID NO 


: 4870agaatctcaacttccaact 


1978 


1997 


1 


3 


SEQ ID NO


3530 


aalULUiaaULlUUdaUigi 


1980


1999SEQ ID NO 


: 4871 acaggggtcctttatgatt 


12350 


12369 


1 


3 


SEQ ID NO


3531 


y LUd ly y duiicag aaaai 


1997


2016SEQ ID NO 


4872atttgaaagaataaatgac 


7036 


7055 


1 


3 


SEQ ID NO


3532 


luddoiOLdcaaaicigTi 




2048SEQ ID NO 


4873aacacattgaggctattga 


6978 


6997 


1 


3 


SEQ ID NO


3533 


dduiciacaaaicigiiic 


2031


2050SEQ ID NO 


4874gaaaaaggggattgaagtt 


10284 


10303 


1 


3 


SEQ ID NO:


; 3534 



dddiagaagggaaicuax 


2079


2098SEQ ID NO 



4875ataagcaaactgttaattt 


5457 


5476 


1 


3 


SEQ ID NO:


3535 


dy ddy y yaaiciiaiaxil 




2102SEQ ID NO: 


4876aaatgcactgctgcgttct 


4900 


4919 


1 


3 


SEQ ID NO:


3536 


yadyygdaxcnaxaixig 




2103SEQ ID NO: 


4877caaaaacattttcaacttc 


5287 


5306 


1 


3 


SEQ ID NO:


3537 


ly dLOL>d dd laacxaccix 





2120SEQ ID NO: 


4878aaggaagaaagaaaaatca 


3461 


3480 


1 


3 


SEQ ID NO:



3538 


lyyaxiLgcxicagcigac 


2158




2177SEQ ID NO: 


4879gtcagcccagttccttcca 


10932 


10951 


1 


3 


SEQ ID NO:


3539 


iL Ly uLLudy ciy accica 


2162


2181 SEQ ID NO:


4880tgaggaaactcagatcaaa 


12265 


12284 


1 


3 


SEQ ID NO:


3540 


^ityyclayydelaayycnx 


2191


2210SEQ ID NO: 


4881 aaagcattggtagagcaag 


7850 


7869 


1 


3 


SEQ ID NO:


3541 


ly y aay gaaa a g g c ixxga 


2193



2212SEQ ID NO: 


4882tcaagtctgtgggattcca 


4086 


4105 


1 


3 


SEQ ID NO:


3542 



^yuLLLydyucaacaxigg 





2223 SEQ ID NO:


4883ccaagaggtatttaaagcc 


12958 


12977 


1 


3 


SEQ ID NO:


3543 


i-ydyLtuddCdiiyyaayci 


2209



2228SEQ ID NO: 


4884agctttctgccactgctca 


13521 


13540 


1 


3 


SEQ ID NO:


3544 


y dy ccdacd Liy gaagcxc 




2229SEQ ID NO: 


4885gagctttctgccactgctc 


13520 


13539 


1 


3 


SEQ ID NO:


3545 


ddLrdLiyyddyClClllXl 


2215


2234SEQ ID NO: 


4886aaaagaaacagcatttgtt 


4539 


4558


1


3 


SEQ ID NO:


3546 


LyyddyoLoiiiiigygaa 




2239SEQ ID NO: 


4887ttccggcacgtgggttcca 


3785 


3804 




3 


SEQ ID NO:


3547 


oLULiiLiyy yddycaagg 




2245SEQ ID NO: 


4888 ccttactgactttgcagag 


7798 


7817 


1 


3 


SEQ ID NO:


3548 


iiiLigyyaagcaaggaxi 


2229



2248 SEQ ID NO: 


4889 aatcattgaaaaattaaaa 


6730 


6749 


1 


3 


SEQ ID NO:



3549 


iLiLiAfCdydcagigxcaa 


2247


2266SEQ ID NO; 


4890ttgatgaaatcattgaaaa 


6723 


6742 


1 


3 


SEQ ID NO:



3550 


iiy y oididoodddy diyd 


2331


2350SEQ ID NO: 



4891 tcattgctcccggagccaa 


2676 


2695 


1 


3 


SEQ ID NO:


3551 


clLdL>OddaydiydlSaaCa 




2356SEQ ID NO: 


4892 tgttgcttttgtaaagtat 


6280 


6299 


1 


3 


SEQ ID NO:


3552


ydyLrdyydLdxygiaaaxg 




2376SEQ ID NO: 


4893 catttcagccttcgggctc 


4262 


4281 


1 


3 



SEQ ID NO:


3553 


diyyLdddxyyaaxaaxgc 




2385SEQ ID NO: 


4894 gcatgcctagtttctccat 


9954 


9973 


1 


3 




3554


i« cid ly y dd IddLy Cl 


2367


2386SEQ ID NO: 


4895 agcacagtacgaaaaacca 


10809 


10828 




3 


SEQ ID NO: 


3555 


taaatggaataatgctcag 


2370 


2389SEQ ID NO: 


4896ctgaaagagatgaaattta 


13067 


13086 




3 


SEQ ID NO: 


3556 


tggaataatgctcagtgtt 


2374 


2393SEQ ID NO: 


4897 aacagatttgaggattcca 


7981 


8000 




3 


SEQ ID NO: 


3557 


tcagtgttgagaagctgat 


2385 


2404SEQ ID NO: 


4898 atcacaactcctccactga 


9542 


9561 




3 


SEQ ID NO: 


3558 


cagtgttgagaagctgatt 


2386 


2405SEQ ID NO: 


4899 aatcacaactcctccactg 


9541 


9560 




3 


SEQ ID NO: 


3559 


agtgttgagaagctgatta 


2387 


2406SEQ ID NO: 


4900taatcacaactcctccact 


9540 


9559 




3 


SEQ ID NO: 


3560 


gattaaagatttgaaatcc


2401 


2420SEQ ID NO: 


4901 ggatactaagtaccaaatc 


6874 


6893 




3 


SEQ ID NO: 


3561 


gatttgaaatccaaagaag 


2408 


2427SEQ ID NO: 


4902 cttccgtttaccagaaatc 


8248 


8267 


1 


3 


SEQ ID NO: 


3562 


atttgaaatccaaagaagt 


2409 


2428SEQ ID NO: 


4903 acttccgtttaccagaaat 


8247 


8266 


1 


3


SEQ ID NO


3563 


atccaaagaagtcccggaa 


2416 


2435SEQ ID NO 


: 4904ttccaatttccctgtggat 


3688 


3707 


1 3 


SEQ ID NO


: 3564 


tccaaagaagtcccggaag


2417 


2436 SEQ ID NO


; 4905cttccaatttccctgtgga 


3687 


3706 


1 3 


SEQ ID NO 


: 3565 


agagcctacctccgcatct 


2438 


2457SEQ ID NO 


: 4906 agattaatccgctggctct 


8571 


8590 


1 3 


SEQ ID NO 


: 3566 


gagcctacctccgcatctt 


2439 


2458SEQ ID NO 


: 4907 aagattaatccgctggctc


8570 


8589 


1 3 


SEQ ID NO 


: 3567 


cttgggagaggagcttggt 


2455 


2474SEQ ID NO 


: 4908accactgggacctaccaag 


12527 


12546 


1 3 


SEQ ID NO 


: 3568 


ggagcttggttttgccagt 


2464 


2483SEQ ID NO 


: 4909actggtggcaaaaccctcc 


2734 


2753 


1 3 


SEQ ID NO 


. 3569 


ttggttttgccagtctcca


2469 


2488SEQ ID NO 


: 4910tggagaagccacactccaa 


10771 


10790 


1 3 


SEQ ID NO 


: 3570 


cagtctccatgacctccag 


2479 


2498 SEQ ID NO 


: 4911 ctggtcgcctgccaaactg


3538 


3557 


1 3 


SEQ ID NO 


: 3571 


ctccatgacctccagctcc 


2483 


2502SEQ ID NO 


: 4912ggagtcattgctcccggag 


2672 


2691 


1 3 


SEQ ID NO 


: 3572 


ctgggaaagctgcttctga 


2501 


2520 SEQ ID NO


: 4913tcagaaagctaccttccag 


7939 


7958 


1 3 


SEQ ID NO 


: 3573 


gaggtcatcaggaagggct 


2561 


2580SEQ ID NO 


: 4914agccagaagtgagatcctc 


3514 


3533 


1 3 


SEQ ID NO 


: 3574 


aagaatgacttttttcttc


2582 


2601 SEQ ID NO 


4915 gaaggcatctgggagtctt


3835 


3854 


1 3 


SEQ ID NO 


: 3575 


cttttttcttcactacatc


2590 


2609SEQ ID NO 


: 4916gatgcttacaacactaaag 


6107 


6126 


1 3 


SEQ ID NO 


: 3576 


catcttcatggagaatgcc 


2605 


2624SEQ ID NO 


: 4917 ggcacttccaaaattgatg


10718 


10737 1


1 3 


SEQ ID NO 


: 3577 


cttcatggagaatgccttt 


2608 


2627SEQ ID NO 


: 4918aaagttaattgggaagaag 


12281 


12300


1 3 


SEQ ID NO 


: 3578 


aatgcctttgaactcccca 


2618 


2637SEQ ID NO 


: 4919 tgggctggcttcagccatt


5737 


5756 


1 3 


SEQ ID NO 


3579 


gcctttgaactccccactg


2621 


2640SEQ ID NO 


: 4920cagtctgaacattgcaggc 


5383 


5402 


1 3 


SEQ ID NO 


3580 


caaggctggagtaaaactg 


2692 


2711 SEQ ID NO 


: 4921 cagtgcaacgaccaacttg 


5080 


5099 • 


1 3 


SEQ ID NO 


: 3581 


tggagtaaaactggaagta 


2698 


2717SEQ ID NO 


: 4922tactccaacgccagctcca 


3059 


3078 ' 


1 3 


SEQ ID NO 


3582 


ggaagtagccaacatgcag 


2710 


2729SEQ ID NO 


4923ctgccatctcgagagttcc 


4106 


4125 


1 3 


SEQ ID NO 


3583 


tttgtgacaaatatgggca 


2765 


2784SEQ ID NO 


4924tgcctttgtgtacaccaaa 


11236 


11255 ' 


1 3 


SEQ ID NO 


3584 


tgtgacaaatatgggcatc 


2767 


2786SEQ ID NO 


4925gatgggtctctacgccaca 


4385 


4404 ' 


1 3 


SEQ ID NO 


3585 


ggacttcgctaggagtggg


2794 


2813SEQ ID NO 


4926cccaaggccacaggggtcc 


12341 


12360 ' 


1 3 


SEQ ID NO 


3586 


gtggggtccagatgaacac 


2808 


2827SEQ ID NO 


4927gtgttctagacctctccac 


4179 


4198 ' 


1 3 


SEQ ID NO. 


3587 


ttccacgagtcgggtctgg 


2834 


2853SEQ ID NO: 


4928 ccagaatctgtaccaggaa 


12562 


12581 ' 


1 3 


SEQ ID NO: 


3588 


agtcgggtctggaggctca 


2841 


2860SEQ ID NO: 


4929 tgagaactacgagctgact


4807 


4826 1 


1 3 


SEQ ID NO: 


3589 


tcgggtctggaggctcatg


2843 


2862SEQ ID NO: 


4930 catgaaggccaaattccga


7639 


7658 1 


1 3 


SEQ ID NO: 


3590 


aaaagctgggaagctgaag 


2869 


2888SEQ ID NO: 


4931 cttccagacacctgatttt 


7951 


7970 1 


1 3 


SEQ ID NO: 


3591 


aagctgaagtttatcattc 


2879 


2898SEQ ID NO: 


4932 gaatttacaattgttgctt 


6269 


6288 1 


1 3 


SEQ ID NO: 


3592 


gagaccagtcaagctgctc 


2908 


2927SEQ ID NO: 


4933 gagcttcaggaagcttctc 


13214 


13233 1 


3 


SEQ ID NO: 


3593 


gcaacacattacatttggt 


2934 


2953SEQ ID NO: 


4934 accagtcagatattgttgc 


10191 


10210 1 


3 


SEQ ID NO: 


3594 


acattacatttggtctcta 


2939 


2958SEQ ID NO: 


4935tagaatatgaactaaatgt 


11889 


11908 1 


3 


SEQ ID NO: 


3595 


cattacatttggtctctac 


2940 


2959SEQ ID NO: 


4936 gtagctgagaaaatcaatg 


7106 


7125 1 


3 


SEQ ID NO: 


3596 


aaacggaggtgatcccacc 


2964 


2983 SEQ ID NO: 


4937 ggtggataccctgaagttt 


3205 


3224 1 


3 


SEQ ID NO: 


3597 


attgagaacaggcagtcct 


2987 


3006SEQ ID NO: 


4938 aggaaaagcgcacctcaat


12031 


12050 1 


3 


SEQ ID NO: 


3598 


tgagaacaggcagtcctgg 


2989 


3008 SEQ ID NO: 


4939 ccagcttccccacatctca 


8341 


8360 1 


3 


SEQ ID NO: 


3599 


ctgcacctcaggcgcttac


3043 


3062SEQ ID NO: 


4940 gtaagaaaatacagagcag 


6440 


6459 1 


3 


SEQ ID NO: 


3600 


tccacagactccgcctcct


3074 


3093 SEQ ID NO: 


4941 aggacagagccttggtgga 


3192 


3211 1 


3 


SEQ ID NO: 


3601 


ctgaccggggacaccagat


3101 


3120SEQ ID NO: 


4942 atctgatgaggaaactcag 


12259 


12278 1 


3 


SEQ ID NO: 


3602 


tagagctggaactgaggcc 


3120 


3139SEQ ID NO: 


4943 ggcctctctggggcatcta 


5144 


5163 1 


3 


SEQ ID NO: 


3603 


ctatgagctccagagagag 


3175 


3194SEQ ID NO: 


4944 ctctcacaaaaaagtatag 


6549 


6568 1 


3 


SEQ ID NO: 


3604 


cttggtggataccctgaag 


3202 


3221 SEQ ID NO: 


4945 cttcaggaagcttctcaag 


13217 


13236 1 


3 


SEQ ID NO: 


3605 


ttgtaactcaagcagaagg 


3222 


3241 SEQ ID NO: 


4946 ccttacacaataatcacaa 


9530 


9549 1 


3 


SEQ ID NO: 


3606 


taactcaagcagaaggtgc 


3225 


3244 SEQ ID NO:


4947gcacctagctggaaagtta 


6955 


6974 1 


3 


SEQ ID NO: 


3607 


gcagaaggtgcgaagcaga 


3233 


3252SEQ ID NO: 


4948tctgtgggattccatctgc 


4091 


4110 1 


3 


SEQ ID NO: 


3608 


cagaaggtgcgaagcagac 


3234 


3253SEQ ID NO: 


4949 gtctgtgggattccatctg 


4090 


4109 1 


3 


SEQ ID NO: 


3609 


gtatgaccttgtccagtga 


3288 


3307SEQ ID NO: 


4950tcaccaacggagaacatac 


10851 


10870 1 


3 


SEQ ID NO: 


3610 


tatgaccttgtccagtgaa 


3289 


3308 SEQ ID NO: 


4951 ttcaccaacggagaacata 


10850 


10869 1 


3 


SEQ ID NO: 


3611 


gaagtccaaattccggatt


3305 


3324 SEQ ID NO: 


4952 aatctcaagctttctcttc


10052 


10071 1 


3 


SEQ ID NO: 


3612 


gagggcaaaacgtcttaca 


3371 


3390SEQ ID NO: 


4953tgtacaactggtccgcctc 


4215 


4234 1 


3


SEQ ID NO


: 3613


agggcaaaacgtcttacag 


3372 


3391 SEQ ID NO: 


4954 ctgttag gacaccagccct 


4062 


4081 


SEQ ID NO 


: 3614 


gactcaccctggacattca 


3390 


3409 SEQ ID NO: 


4955 tgaaattcaatcacaagtc 


9076 


9095 


SEQ ID NO 


: 3615 


ctggacattcagaacaaga 


3398 


3417 SEQ ID NO:


4956 tcttttcttttcagcccag 


9226 


9245 


SEQ ID NO 


3616 


tcatgggcgacctaagttg 


3435 


3454 SEQ ID NO: 


4957 caactgcagacatatatga 


6635 


6654 


SEQ ID NO 


: 3617 


tgggcgacctaagttgtga 


3438 


3457SEQ ID NO: 


4958 tcactccattaacctccca 


6316 


6335 


SEQ ID NO 


: 3618 


agttgtgacacaaaggaag 


3449 


3468SEQ ID NO: 


4959 cttcttttccaattgaact 


13838


13857 


SEQ ID NO 


: 3619 


tgacacaaaggaagaaaga 


3454 


3473SEQ ID NO: 


4960 tcttcatcttcatctgtca


10220 


10239 


SEQ ID NO 


: 3620 


gacacaaaggaagaaagaa 


3455 


3474SEQ ID NO: 


4961 ttcttcatcttcatctgtc


10219 


10238 


SEQ ID NO 


: 3621 


ggaagaaagaaaaatcaag 


3463 


3482SEQ ID NO: 


4962 cttgtcatgcctacgttcc 


11348 


11367 


SEQ ID NO 


: 3622 


aaaatcaagggtgttattt


3473 


3492SEQ ID NO: 


4963 aaatcttattggggatttt


7084 


7103 


SEQ ID NO 


: 3623 


tccataccccgtttgcaag


3491 


3510SEQ ID NO: 


4964 cttggattcaaaatgtgga 


6858 


6877 


SEQ ID NO 


: 3624 


tgcaagcagaagccagaag 


3504 


3523SEQ ID NO: 


4965 cttcagggaacacaatgca 


5185 


5204 


SEQ ID NO 


: 3625 


cagaagccagaagtgagat 


3510 


3529SEQ ID NO: 


4966 atctatgccatctcttctg 


5633 


5652 


SEQ ID NO 


: 3626 


tgagatcctcgcccactgg 


3523 


3542SEQ ID NO: 


4967 ccagcttccccacatctca 


8341 


8360 


SEQ ID NO 


: 3627 


ggtcgcctgccaaactgct


3540 


3559SEQ ID NO: 


4968agcacatatgaactggacc 


13947 


13966 


SEQ ID NO 


: 3628 


tgcttctccaaatggactc 


3555 


3574SEQ ID NO: 


4969 gagtttatcagtcagagca 


9701 


9720 


SEQ ID NO 


3629 


tggactcatctgctacagc 


3567 


3586SEQ ID NO: 


4970 gctgcagtggcccgttcca 


8167 


8186 


SEQ ID NO 


: 3630 


gctacagcttatggctcca 


3578 


3597SEQ ID NO: 


4971 tggaggacattcctctagc 


8211 


8230 


SEQ ID NO 


: 3631 



ggtggcatggcattatgat 


3610 


3629SEQ ID NO: 


4972atcacaaattagtttcacc 


8947 


8966 


SEQ ID NO 


: 3632 


agagaagattgaatttgaa 


3631 


3650SEQ ID NO: 


4973ttcaacgatacctgtctct 


7713 


7732 


SEQ ID NO 


: 3633 


caggcaccaatgtagatac 


3657 


3676SEQ ID NO: 


4974gtatgctaatagactcctg 


3736 


3755 


SEQ ID NO 


3634 


gacttccaatttccctgtg 


3685 


3704SEQ ID NO: 


4975 cacaatgcaaaattcagtc 


5195 


5214 


SEQ ID NO 


3635 


gtccctcaaacagacatga 


3764 


3783SEQ ID NO: 


4976tcataagggaggtagggac 


12777 


12796 


SEQ ID NO 


: 3636 


caaacagacatgactttcc 


3770 


3789SEQ ID NO: 


4977 ggaactacaatttcatttg 


7022 


7041 


SEQ ID NO. 


3637 


atagttgcaatgagctcat


3809 


3828SEQ ID NO: 


4978 atgatttgaaaatagctat 


6693 


6712 


SEQ ID NO: 


3638 


gcttcagaaggcatctggg 


3829 


3848SEQ ID NO: 


4979cccaagaggtatttaaagc 


12957 


12976 


SEQ ID NO: 


3639 


ggagttcaacctccagaac 


3895 


3914SEQ ID NO: 


4980 gttcactccattaacctcc


6314 


6333 


SEQ ID NO: 


3640 


agaaaacctcttcttaaaa 


3940 


3959SEQ ID NO: 


4981 ttttctaaatggaacttct 


12173 


12192 


SEQ ID NO: 


3641 


aaaacctcttcttaaaaag 


3942 


3961 SEQ ID NO: 


4982 ctttgaaaaattctctttt


9213 


9232 


SEQ ID NO: 


3642 


aaaaagcgatggccgggtc 


3955 


3974SEQ ID NO: 


4983gaccttgcaagaatatttt 


6343 


6362 


SEQ ID NO: 


3643 


gtcaaatataccttgaaca 


3971 


3990SEQ ID NO: 


4984tgttaacaaattccttgac 


7363 


7382 


SEQ ID NO: 


3644 


tgaacaagaacagtttgaa 


3984 


4003SEQ ID NO: 


4985 ttcaagttcctgaccttca


8310 


8329 


SEQ ID NO: 


3645 


agtttgaaaattgagattc


3995 


4014SEQ ID NO: 


4986 gaatctggctccctcaact


9047 


9066 



SEQ ID NO: 


3646 


gtttgaaaattgagattcc 


3996 


4015SEQ ID NO: 


4987 ggaaataccaagtcaaaac 


10454 


10473 


SEQ ID NO: 


3647 


ttgaaaattgagattcctt


3998 


4017SEQ ID NO: 


4988 aaggaaaagcgcacctcaa 


12030 


12049 


SEQ ID NO: 


3648 


ctaaagatgttagagactg 


4046 


4065SEQ ID NO: 


4989 cagttgaccacaagcttag


10545 


10564 


SEQ ID NO: 


3649 


atgttagagactgttagga 


4052 


4071 SEQ ID NO: 


4990 tccttaacaccttccacat


8073 


8092 


SEQ ID NO: 


3650 


cagccctccacttcaagtc 


4074 


4093SEQ ID NO: 


4991 gacttctctagtcaggctg


8813 


8832 


SEQ ID NO: 


3651 


agccctccacttcaagtct


4075 


4094SEQ ID NO: 


4992 agacatcgctgggctggct 


5728 


5747 


SEQ ID NO: 


3652 


ccatctgccatctcgagag


4102 


4121 SEQ ID NO: 


4993 ctctcaaatgacatgatgg


5330 


5349 


SEQ ID NO: 


3653 


attcccaagttgtatcaac 


4142 


4161 SEQ ID NO: 


4994 gttgagaagccccaagaat 


6254 


6273 


SEQ ID NO: 


3654 


tcaactgcaagtgcctctc 


4156 


4175SEQ ID NO: 


4995 gagatcaagacactgttga 


8843 


8862 


SEQ ID NO: 


3655 


ggtgttctagacctctcca 


4178 


4197 SEQ ID NO:


4996 tggaaccctctccctcacc 


4735 


4754 


SEQ ID NO: 


3656 


ctccacgaatgtctacagc 


4192 


4211 SEQ ID NO: 


4997 gctggtaacctaaaaggag 


5588 


5607 


SEQ ID NO: 


3657 


cacgaatgtctacagcaac 


4195 


4214SEQ ID NO: 


4998 gttgcccaccatcatcgtg 


11671 


11690 


SEQ ID NO: 


3658 


acgaatgtctacagcaact 


4196 


4215SEQ ID NO: 


4999 agttgcccaccatcatcgt 


11670 


11689 


SEQ ID NO: 


3659 


tcctacagtggtggcaaca 


4232 


4251 SEQ ID NO: 


5000 tgttagttgctcttaagga


13359 


13378 


SEQ ID NO: 


3660 


cgttaccacatgaaggctg 


4280 


4299 SEQ ID NO: 


5001 cagcaagtacctgagaacg 


8611 


8630 


SEQ ID NO: 


3661 


gaaggctgactctgtggtt 


4291 


4310SEQ ID NO: 


5002 aacctatgccttaatcttc 


13169 


13188 


SEQ ID NO: 


3662 


tgtggttgacctgctttcc 


4303 


4322SEQ ID NO: 


5003 ggaaagttaaaacaacaca 


6965 


6984


SEQ ID NO


: 3663 


cctgctttcctacaatgtg


4312 


4331 SEQ 


ID NO 


5004 cacaccttgacattgcagg 


11088 


11107 


1 3 


SEQ ID NO 


3664 


ctgctttcctacaatgtgc 


4313 


4332 SEQ 


ID NO 


5005 gcacaccttgacattgcag 


11087 


11106 


1 3 


SEQ ID NO 


3665 


tcctacaatgtgcaaggat 


4319 


4338 SEQ 


ID NO 


5006 atccgctggctctgaagga 


8577 


8596 


1 3 


SEQ ID NO 


3666 


tatgaccacaagaatacgt


4352 


4371 SEQ 


ID NO 


5007 acgtccgtgtgccttcata


9984 


10003 


1 3 


SEQ ID NO 


: 3667 


atgaccacaagaatacgtc 


4353 


4372 SEQ 


ID NO 


5008 gacgtccgtgtgccttcat 


9983 


10002 ' 


1 3 


SEQ ID NO 


: 3668 


gaatacgtctacactatca 


4363 


4382 SEQ 


ID NO 


: 5009 tgattatctgaattcattc


6487 


6506 


1 3 


SEQ ID NO 


: 3669 


tttctagattcgaatatca 


4406 


4425 SEQ 


ID NO 


: 5010 tgatttacatgatttgaaa


6685 


6704 ' 


1 3 


SEQ ID NO 


3670 


gattcgaatatcaaattca


4412 


4431 SEQ 


ID NO 


5011 tgaagtagctgagaaaatc


7102 


7121 


1 3 


SEQ ID NO 


: 3671 


gaaacaacccagtctcaaa 


4449 


4468 SEQ 


ID NO 


: 5012 tttgaaaaattctcttttc


9214 


9233 • 


1 3 


SEQ ID NO 


: 3672 


cccagtctcaaaaggttta 


4456 


4475SEQ 


ID NO 


: 5013 taaattcattactcctggg


11302 


11321 • 


1 3 


SEQ ID NO 


3673 


ctcaaaaggtttactaata 


4462 


4481 SEQ 


ID NO 


5014 tattcaaaactgagttgag


12231 


12250 ' 


1 3 


SEQ ID NO 


3674 


tcaaaaggtttactaatat 


4463 


4482SEQ 


ID NO 


5015 atattcaaaactgagttga


12230 


12249 ' 


1 3 


SEQ ID NO 


3675 


aaaaggtttactaatattc 


4465 


4484SEQ 


ID NO 


5016 gaatttgaaagttcgtttt


9280 




1 3 


SEQ ID NO 


: 3676 


gaaacagcatttgtttgtc 


4543 


4562 SEQ 


ID NO 


5017gacagcatcttcgtgtttc 


11214 


11233


1 3 


SEQ ID NO 


3677 


atttgtttgtcaaagaagt


4551 


4570 SEQ 


ID NO 


5018 acttaaaaaatataaaaat


8022 


8041 ' 


1 3 


SEQ ID NO 


3678 


tcaagattgatgggcagtt 


4569 


4588SEQ 


ID NO 


5019 aactctcaagtcaagttga


13422 


13441 ' 


1 3 


SEQ ID NO 


3679 


ttcagagtctcttcgttct


4586 


4605 SEQ 


ID NO 


5020 agaagatggcaaatttgaa


11995 


12014 ' 


1 3 


SEQ ID NO 


3680 


cagagtctcttcgttctat


4588 


4607SEQ 


ID NO:


5021 atagcatggacttcttctg


8873 


8892 ' 


1 3 


SEQ ID NO 


3681 


atgctaaaggcacatatgg


4605 


4624 SEQ 


ID NO 


5022 ccatttgagatcacggcat 


9245 


9264 - 


1 3 


SEQ ID NO 


3682 


gcacatatggcctgtcttg 


4614 


4633SEQ 


ID NO. 


5023 caagttggcaagtaagtgc 


9372 


9391 


1 3 


SEQ ID NO 


3683 


gagtccaacctgaggttta 


4667 


4686 SEQ 


ID NO, 


5024 taaagtgccacttttactc 


6190 


6209 ' 


1 3 


SEQ ID NO 


: 3684 


agtccaacctgaggtttaa 


4668 


4687 SEQ 


ID NO: 


5025ttaacagggaagatagact 


9308 


9327 ' 


1 3 


SEQ ID NO 


3685 


cctacctccaaggcaccaa 


4692 


4711 SEQ 


ID NO: 


5026ttggcaagtaagtgctagg 


9376 


9395 ' 


1 3 


SEQ ID NO 


3686 


gaagatggaaccctctccc 


4730 


4749SEQ 


ID NO: 


5027gggaagaagaggcagcttc 


12291 


12310 ' 


1 3 


SEQ ID NO 


3687 


tgatctgcaaagtggcatc 


4762 


4781 SEQ 


ID NO: 


5028 gatgaggaaactcagatca 


12263 


12282 ' 


1 3 


SEQ ID NO 


3688 


gatctgcaaagtggcatca 


4763 


4782SEQ 


ID NO. 


5029 tgatgaggaaactcagatc


12262 


12281 ' 


1 3 


SEQ ID NO 


3689 


gcttccctaaagtatgaga 


4793 


4812SEQ 


ID NO: 


5030tctcgtgtctaggaaaagc 


5977 


5996 ' 


1 3 


SEQ ID NO 


3690 


gtatgagaactacgagctg 


4804 


4823 SEQ 


ID NO: 


5031 cagcttaagagacacatac 


6920 


6939 ' 


1 3 


SEQ ID NO 


3691 


tctaacaagatggatatga 


4868 


4887SEQ 


ID NO: 


5032tcattttccaactaataga 


13032 


13051 ' 


1 3 


SEQ ID NO 


3692 


ctgctgcgttctgaatatc 


4907 


4926SEQ 


ID NO: 


5033 gatacaagaaaaactgcag 


6901 


6920 ' 


1 3 


SEQ ID NO 


3693 


tcattgaggttcttcagcc


4940 


4959SEQ 


ID NO: 


5034ggctcatatgctgaaatga 


5348 


5367 ' 


1 3 


SEQ ID NO: 


3694 


ttctggatcactaaattcc 


4963 


4982SEQ 


ID NO: 


5035 ggaaggacaaggcccagaa 


12549 


12568


1 3 


SEQ ID NO: 


3695 


ccatggtcttgagttaaat 


4981 


5000SEQ 


ID NO: 


5036 atttttattcctgccatgg 


10103 


10122 ' 


I 3 


SEQ ID NO: 


3696 


tcttaggcactgacaaaat 


5007 


5026SEQ 


ID NO: 


5037 attttttgcaagttaaaga 


14019 


14038 - 


I 3 


SEQ ID NO: 


3697 


acaaggcgacactaaggat 


5040 


5059SEQ 


ID NO: 


5038 atccatgatctacatttgt 


6794 


6813 ' 


I 3 


SEQ ID NO: 


3698 


tgcaacgaccaacttgaag 


5083 


5102SEQ 


ID NO: 


5039 cttcagggaacacaatgca 


5185 


5204 ' 


I 3 


SEQ ID NO: 


3699 


caacttgaagtgtagtctc 


5092 


5111 SEQ


ID NO: 


5040 gagatgagagatgccgttg 


6239 


6258 1 


I 3 


SEQ ID NO: 


3700 


gctggagaatgagctgaat 


5116 


5135SEQ 


ID NO: 


5041 attctcttttcttttcagc 


9222 


9241 1 


I 3 


SEQ ID NO: 


3701 


gcagagcttggcctctctg


5135 


5154SEQ 


ID NO: 


5042 cagatacaagaaaaactgc 


6899 


6918 1 


I 3 


SEQ ID NO: 


3702 


tctctgg g g catctatg aa 


5148 


5167SEQ 


ID NO: 


5043ttcattcaattgggagaga 


6499 


6518 1 


I 3 


SEQ ID NO: 


3703 


tctggggcatctatgaaat 


5150 


5169SEQ 


ID NO: 


5044 atttgtaagaaaatacaga 


6436 


6455 1 


I 3 


SEQ ID NO: 


3704 


aacacaatgcaaaattcag 


5193 


5212SEQ 


ID NO: 


5045 ctgaagcattaaaactgtt 


7506 


7525 1 


I 3 


SEQ ID NO: 


3705 


ctcacagagctatcactgg 


5231 


5250 SEQ 


ID NO: 


5046 ccagatgctgaacagtgag 


8149 


8168 1 


I 3 


SEQ ID NO: 


3706 


tgggaagtgcttatcaggc 


5247 


5266SEQ 


ID NO: 


5047 gcctacgttccatgtccca


11356 


11375 1 


I 3 


SEQ ID NO: 


3707 


ttcaaggtcagtcaagaag 


5303 


5322SEQ 


ID NO: 


5048 cttcagtgcagaatatgaa 


11977 


11996 ' 


I 3 


SEQ ID NO: 


3708 


aatgacatgatgggctcat 


5336 


5355SEQ 


ID NO: 


5049 atgattatctgaattcatt 


6486 


6505 ' 


I 3 


SEQ ID NO: 


3709 


gctcatatgctgaaatgaa 


5349 


5368SEQ 


ID NO: 


5050ttcagccattgacatgagc 


5746 


5765 ' 


I 3 


SEQ ID NO: 


3710 


atatgctgaaatgaaattt 


5353 


5372sEQ 


ID NO: 


5051 aaatagctattgctaatat 


6702 


6721 


I 3 


SEQ ID NO: 


3711 


tctgaacattgcaggctta


5386 


5405 SEQ 


ID NO. 


5052taagaaccagaagatcaga 


10996 


11015


I 3 


SEQ ID NO: 


3712 


gaacattgcaggcttatca 


5389 


5408 SEQ 


ID NO: 


5053 tgatatcgacgtgaggttc 


12490 


12509 1 


I 3 
SEQ ID NO


: 3713 


tgcaggcttatcactggac 


5395 


5414 SEQ ID NO


: 5054 gtcctggattccacatgca 


11852 


11871 


1 3 


SEQ ID NO 


: 3714 


tcaaaacttgacaacattt 


5420 


5439 SEQ ID NO 


: 5055 aaattccttgacatgttga


7370 


7389 


1 3 


SEQ ID NO 


: 3715 


atttacagctctgacaagt


5435 


5454SEQ ID NO 


: 5056acttaaaaaatataaaaat 


8022 


8041 


1 3 


SEQ ID NO 


: 3716 


ctctgacaagttttataag 


5443 


5462 SEQ ID NO


: 5057 cttacttgaattccaagag


10674 


10693 


1 3 


SEQ ID NO 


: 3717 


gttaatttacagctacagc 


5468 


5487SEQ ID NO 


: 5058 gctgcatgtggctggtaac 


5578 


5597 


1 3 


SEQ ID NO 


: 3718 


ttctctggtaactacttta


5491 


5510 SEQ ID NO


: 5059 taaaagattactttgagaa


7275 


7294 


1 3 


SEQ ID NO 


: 3719 


cctaaaaggagcctaccaa


5596 


5615 SEQ ID NO


: 5060ttggcaagtaagtgctagg 


9376


9395 


1 3 


SEQ ID NO


: 3720 


aaaaggagcctaccaaaat 


5599 


5618SEQ ID NO 


: 5061 atttacaattgttgctttt 


6271 


6290 


1 3 


SEQ ID NO 


: 3721 


aggagcctaccaaaataat


5602 


5621 SEQ ID NO


: 5062attacctatgatttctcct 


10127 


10146 


1 3 


SEQ ID NO 


: 3722 


ataatgaaataaaacacat 


5616 


5635 SEQ ID NO


: 5063atgtcaaacactttgttat 


7065 


7084 


1 3 


SEQ ID NO 


: 3723 


aaaacacatctatgccatc 


5626 


5645 SEQ ID NO


: 5064gatgaagatgacgactttt 


12158 


12177 


1 3 


SEQ ID NO


: 3724 


tgctaaggttcagggtgtg 


5686 


5705SEQ ID NO 


: 5065 cacaagtcgattcccagca


9087 


9106 


1 3 


SEQ ID NO 


: 3725 


gagtttagccatcggctca


5705 


5724SEQ ID NO 


: 5066 tgaggtgactcagagactc


7450 


7469 


1 3 


SEQ ID NO 


: 3726 


gctggcttcagccattgac 


5740 


5759SEQ ID NO 


: 5067gtcagtgaagttctccagc 


8596 


8615 


1 3 


SEQ ID NO 


: 3727 


atttcagcaatgtcttccg 


5790 


5809SEQ ID NO 


: 5068 cggagcatgggagtgaaat


8628 


8647 


1 3 


SEQ ID NO 


: 3728 


tttcagcaatgtcttccgt 


5791 


5810 SEQ ID NO


5069 acggagcatgggagtgaaa


8627 


8646 


1 3 


SEQ ID NO 


: 3729 


ttcagcaatgtcttccgtt 


5792 


5811 SEQ ID NO


5070aacggagcatgggagtgaa 


8626 


8645 


1 3 


SEQ ID NO 


3730 


cagcaatgtcttccgttct 


5794 


5813SEQ ID NO 


5071 agaagtgtcttcaaagctg 


12412 


12431 


1 3 


SEQ ID NO 


3731 


tgtcttccgttctgtaatg 


5800 


5819SEQ ID NO 


5072cattcaattgggagagaca 


6501 


6520 


1 3 


SEQ ID NO 


3732 


gtcttccgttctgtaatgg 


5801 


5820SEQ ID NO 


5073ccattcagtctctcaagac 


12975 


12994 • 


1 3 


SEQ ID NO: 


3733 


atgggaaactcgctctctg 


5859 


5878SEQ ID NO 


5074 cagataaaaaactcaccat 


12213 


12232 


1 3 


SEQ ID NO: 


3734 


ggagaacatactgggcagc 


5879 


5898SEQ ID NO 


5075 gctgttttgaagactctcc 


1088 


1107 


1 3 


SEQ ID NO: 


3735 


gttgaaagcagaacctctg 


5914 


5933SEQ ID NO 


5076 cagaattcataatcccaac 


8274 


8293 ' 


1 3 


SEQ ID NO: 


3736 


gtctaggaaaagcatcagt


5983 


6002SEQ ID NO 


5077actgcaagatttttcagac 


13612 


13631 


1 3 


SEQ ID NO: 


3737 


agcatcagtgcagctcttg 


5993 


6012SEQ ID NO 


5078 caagaacctgttagttgct


13351 


13370 ' 


1 3 


SEQ ID NO: 


3738 


ttgaacacaaagtcagtgc 


6009 


6028SEQ ID NO 


5079gcacatcaatattgatcaa 


6418 


6437 1 


I 3 


SEQ ID NO: 


3739 


gcagacaggcacctggaaa 


6046 


6065SEQ ID NO: 


5080tttcagatggcattgctgc 


11610 


11629 1 


1 3 


SEQ ID NO: 


3740 


gaaactcaagacccaattt 


6061 


6080SEQ ID NO: 


5081 aaatcccatccaggttttc 


8037 


8056 ' 


1 3 


SEQ ID NO: 


3741 


acaatgaatacagccagga 


6084 


6103 SEQ ID NO:


5082 tcctttggctgtgctttgt 


9682 


9701 ' 


1 3 


SEQ ID NO: 


3742 


cttggatgcttacaacact 


6103 


6122 SEQ ID NO:


5083 agtgaagttctccagcaag 


8599 


8618 1 


1 3 


SEQ ID NO: 


3743 


ttggcgtggagcttactgg 


6132 


6151 SEQ ID NO: 


5084 ccagaattcataatcccaa 


8273 


8292 1 


I 3 


SEQ ID NO: 


3744 


cacttttactcagtgagcc 


6198 


6217SEQ ID NO: 


5085ggctattgatgttagagtg 


6988 


7007 1 


1 3 


SEQ ID NO: 


3745 


tttagagatgagagatgcc 


6236 


6255 SEQ ID NO:


5086ggcatgatgctcatttaaa 


9177 


9196 1 


3 


SEQ ID NO: 


3746


gagaagccccaagaattta 


6257 


6276 SEQ ID NO:


5087taaagccattcagtctctc 


12970 


12989 1 


3 


SEQ ID NO: 


3747 


caattgttgcttttgtaaa 


6276 


6295SEQ ID NO: 


5088 tttaaccagtcagatattg 


10187 


10206 1 


3 


SEQ ID NO: 


3748 


ttttgtaaagtatgataaa 


6286 


6305sEQ ID NO: 


5089tttattgctgaatccaaaa 


13655 


13674 1 


3 


SEQ ID NO: 


3749 


ttgtaaagtatgataaaaa


6288 


6307SEQ ID NO: 


5090 ttttgagaggaatcgacaa 


6358 


6377 1 


3 


SEQ ID NO: 


3750 


ttcactccattaacctccc 


6315 


6334SEQ ID NO: 


5091 gggaaaaaacaggcttgaa 


9576 


9595 1 


3 


SEQ ID NO: 


3751 


ttttgagaccttgcaagaa 


6337 


6356SEQ ID NO: 


5092ttctctctatgggaaaaaa 


9566 


9585 1 


3 


SEQ ID NO: 


3752 


accttgcaagaatattttg 


6344 


6363SEQ ID NO: 


5093 caaaagaagcccaagaggt 


12948 


12967 1 


3 


SEQ ID NO: 


3753 


tcaatattgatcaatttgt 


6423 


6442SEQ ID NO: 


5094acaaagcagattatgttga 


11829 


11848 1 


3 


SEQ ID NO: 


3754 


cagagcagccctgggaaaa 


6451 


6470SEQ ID NO: 


5095 ttttcagaccaactctctg 


13622 


13641 1 


3 


SEQ ID NO: 


3755 


cctgggaaaactcccacag 


6460 


6479SEQ ID NO: 


5096 ctgtctctggtcagccagg 


7724 


7743 1 


3 


SEQ ID NO: 


3756 


actcccacagcaagctaat 


6469 


6488 SEQ ID NO: 


5097 attacacttcctttcgagt 


12869 


12888 1 


3 


SEQ ID NO: 


3757 


aattcattcaattgggaga 


6497 


6516sEQ ID NO: 


5098tctcttcctccatggaatt 


10479 


10498 1 


3 


SEQ ID NO: 


3758 


ttcaattgggagagacaag 


6503 


6522 SEQ ID NO:


5099cttggagtgccagtttgaa 


11808


11827 1 


3 


SEQ ID NO: 


3759 


aggagaaactgactgctct


6534 


6553SEQ ID NO: 


5100 agagcttatgggatttcct


11163 


11182 1 


3 


SEQ ID NO: 


3760 


actgactgctctcacaaaa 


6541 


6560SEQ ID NO: 


5101 ttttggcaagctatacagt 


8380 


8399 1 


3 


SEQ ID NO: 


3761 


gactgctctcacaaaaaag 


6544 


6563 SEQ ID NO:


5102 ctttgtgagtttatcagtc


9695 


9714 1 


3 


SEQ ID NO: 


3762 


cagacatatatgatacaat 


6641 


6660SEQ ID NO: 


51 03attggatatccaagatctg 


1933 


1952 1 


3 
SEQ ID NO


3763 


aatttgatcagtatattaa 


6657 


6676 SEQ ID NO: 


51 04ttaaaagaaatcttcaatt 


13815 


13834 


1 3 


SEQ ID NO 


3764 


tatgatttacatgatttga 


6683 


6702 SEQ ID NO: 


51 05tcaatgattatatcccata 


13128 


13147 


1 3 


SEQ ID NO 


: 3765 


tttgaaaatagctattgct


6697 


6716SEQ ID NO: 


51 06agcacagaaaaaattcaaa 


13864 


13883 


1 3 


SEQ ID NO 


: 3766


ttgaaaatagctattgcta 


6698 


6717SEQ ID NO: 


51 07tagcacagaaaaaattcaa 


13863 


13882 


1 3 


SEQ ID NO 


3767 


aatagctattgctaatatt 


6703 


6722 SEQ ID NO: 


5108 aataaatggagtctttatt 


14084 


14103 


1 3 


SEQ ID NO 


: 3768 


attattgatgaaatcattg 


6719 


6738 SEQ ID NO: 


51 09caataccagaattcataat 


8268 


8287 


1 3 


SEQ ID NO 


: 3769 


aaagtcttgatgagcacta 


6747 


6766 SEQ ID NO: 


5110 tagtgattacacttccttt


12864 


12883 


1 3 


SEQ ID NO 


: 3770 


aagtcttgatgagcactat 


6748 


6767 SEQ ID NO: 


5111 atagcaacactaaatactt 


8769 


8788 


1 3 


SEQ ID NO 


: 3771 


ttgatgagcactatcatat 


6753 


6772SEQ ID NO: 


5112 atatccaagatgagatcaa 


13101 


13120 


1 3 


SEQ ID NO 


: 3772 


taattttagtaaaaacaat 


6777 


6796 SEQ ID NO:


5113 attgagattccctccatta 


11702 


11721 


1 3 


SEQ ID NO 


: 3773 


ttttagtaaaaacaatcca 


6780 


6799 SEQ ID NO: 


5114tggagtgccagtttgaaaa 


11810 


11829 


1 3 


SEQ ID NO 


: 3774 


acatttgtttattgaaaat


6805 


6824 SEQ ID NO: 


5115 atttcctaaagctggatgt 


11175 


11194 


1 3 


SEQ ID NO 


: 3775 


attgattttaacaaaagtg 


6824 


6843SEQ ID NO: 


51 16cactgttccagttgtcaat 


9871 


9890 


1 3 


SEQ ID NO 


: 3776 


attttaacaaaagtggaag 


6828 


6847 SEQ ID NO: 


51 1 7cttcaaagacttaaaaaat 


8014 


8033 


1 3 


SEQ ID NO 


: 3777 


aaatcagaatccagataca 


6888 


6907SEQ ID NO: 


51 1 8tgtaccataagccatattt 


10088 


10107 


1 3 


SEQ ID NO 


: 3778 


gaatccagatacaagaaaa 


6894 


6913SEQ ID NO: 


51 1 9ttttctaaacttgaaattc 


9065 


9084 


1 3 


SEQ ID NO 


: 3779 


ttaagagacacatacagaa 


6924 


6943SEQ ID NO: 


51 20ttcttaaacattcctttaa 


9491 


9510 


1 3 


SEQ ID NO 


3780 


atccagcacctagctggaa


6950 


6969 SEQ ID NO: 


5121 ttccaatttccctgtggat 


3688 


3707 


1 3 


SEQ ID NO 


: 3781 


tgagcatgtcaaacacttt 


7060 


7079 SEQ ID NO: 


51 22 aaagtgccacttttactca 


6191 


6210 


1 3 


SEQ ID NO 


: 3782 


gagcatgtcaaacactttg 


7061 


7080SEQ ID NO: 


5123caaatgacatgatgggctc 


5334 


5353 • 


1 3 


SEQ ID NO 


: 3783 


aaacactttgttataaatc 


7070 


7089SEQ ID NO: 


51 24gattatatcccatatgttt 


13133 


13152 


1 3 


SEQ ID NO 


3784 


tgagaaaatcaatgccttc 


7111 


7130SEQ ID NO: 


51 25gaaggaaaagcgcacctca 


12029 


12048 ' 


1 3 


SEQ ID NO 


3785


tatgaagtagaccaacaaa 


7160 


7179SEQ ID NO: 


5126tttgtggagggtagtcata 


10331 


10350


1 3 


SEQ ID NO 


: 3786 


aagtagaccaacaaatcca 


7164 


7183SEQ ID NO: 


51 27tggatgaagatgacgactt 


12156 


12175 - 


1 3 


SEQ ID NO 


3787 


aagttgaaggagactattc 


7223 


7242 SEQ ID NO: 


51 28 gaataccaatgctgaactt 


10168 


10187 ' 


I 3 


SEQ ID NO: 


3788 


acaagttaagataaaagat 


7264 


7283SEQ ID NO: 


5129 atctaaattcagttcttgt


11334 


11353 1 


1 3 


SEQ ID NO: 


3789 


aagataaaagattactttg 


7271 


7290SEQ ID NO: 


5130caaaatagaagggaatctt 


2077 


2096 1 


1 3 


SEQ ID NO: 


3790 


gattactttgagaaattag 


7280 


7299SEQ ID NO: 


5131 ctaaacttgaaattcaatc 


9069 


9088 1 


1 3 


SEQ ID NO: 


3791 


tgagaaattagttggattt 


7288 


7307SEQ ID NO: 


51 32aaatccgtgaggtgactca 


7443 


7462 ' 


1 3 


SEQ ID NO: 


3792 


aaattagttggatttattg


7292 


7311 SEQ ID NO:


51 33caattttgagaatgaattt 


10419 


10438 1 


3 


SEQ ID NO: 


3793 


tggatttattgatgatgct 


7300 


7319 SEQ ID NO:


51 34agcatgcctagtttctcca 


9953 


9972 1 


3 


SEQ ID NO: 


3794 


tcattgaagatgttaacaa 


7353 


7372SEQ ID NO: 


5135ttgtagatgaaaccaatga 


7422 


7441 1 


3 


SEQ ID NO: 


3795 


cattgaagatgttaacaaa 


7354 


7373SEQ ID NO: 


51 36tttgtagatgaaaccaatg 


7421 


7440 1 


3 


SEQ ID NO: 


3796 


attgaagatgttaacaaat 


7355 


7374SEQ ID NO: 


51 37atttaagtatgatttcaat 


10495 


10514 1 


3 


SEQ ID NO: 


3797 


ttgaagatgttaacaaatt 


7356 


7375SEQ ID NO: 


51 38aatttaagtatgatttcaa 


10494 


10513 1 


3 


SEQ ID NO: 


3798 


tgaagatgttaacaaattc 


7357 


7376SEQ ID NO: 


5139gaatttaagtatgatttca 


10493 


10512 1 


3 


SEQ ID NO: 


3799 


acatgttgataaagaaatt


7380 


7399SEQ ID NO: 


51 40aattccctgaagttgatgt 


11487 


11506 1


3 


SEQ ID NO: 


3800 


tttgattaccaccagtttg 


7406 


7425SEQ ID NO: 


5141 caaattgaacatccccaaa 


8791 


8810 1 


3 


SEQ ID NO: 


3801 


caaaatccgtgaggtgact 


7441 


7460SEQ ID NO: 


5142 agtccccctaacagatttg


7972 


7991 1 


3 


SEQ ID NO: 


3802 


aaaatccgtgaggtgactc 


7442 


7461 SEQ ID NO: 


51 43gagtgaaatgctgtttttt 


8638 


8657 1 


3 


SEQ ID NO: 


3803 


aggtgactcagagactcaa 


7452 


7471 SEQ ID NO: 


5144ttgatgatatctggaacct 


10731 


10750 1 


3 


SEQ ID NO: 


3804 


gtgaaattcaggctctgga


7473 


7492SEQ ID NO: 


5145 tccaatctcctcttttcac


8409 


8428 1 


3 


SEQ ID NO: 


3805 


gttgcagtgtatctggaaa 


7547 


7566SEQ ID NO: 


5146tttcaagcaaatgcacaac 


8540 


8559 1 


3 


SEQ ID NO: 


3806 


ttaagttcagcatctttgg 


7616 


7635SEQ ID NO: 


5147 ccaatgctgaactttttaa


10173 


10192 1 


3 


SEQ ID NO: 


3807 


tgaaggccaaattccgaga


7641 


7660SEQ ID NO: 


51 48tctcctttcttcatcttca 


10213 


10232 1 


3 


SEQ ID NO: 


3808 


aatgtatcaaatggacatt 


7684 


7703SEQ ID NO: 


51 49aatgaagtccggattcatt 


11021 


11040 1 


3 


SEQ ID NO: 


3809 


attcagcaggaacttcaac 


7700 


7719SEQ ID NO: 


51 50gttgagaagccccaagaat 


6254 


6273 1 


3 


SEQ ID NO: 


3810 


acctgtctctggtcagcca 


7722 


7741 SEQ ID NO: 


5151 tggcaagtaagtgctaggt 


9377 


9396 1 


3 


SEQ ID NO: 


3811 


cctgtctctggtcagccag 


7723 


7742SEQ ID NO: 


51 52ctggacttctctagtcagg 


8810


8829 1 


3 


SEQ ID NO: 


3812 


ggtcagccaggtttatagc 


7732 


7751 SEQ ID NO: 


51 53gctaaaggagcagttgacc 


10535 


10554 1 


3 
SEQ ID NO


3813 


ccaggtttatagcacactt 


7738 


7757SEQ 


ID NO 


5154 aagtccggattcattctgg


11025 


11044 1 


3 


SEQ ID NO 


: 3814 


gtttatagcacacttgtca 


7742 


7761 SEQ 


ID NO 


: 5155tgacctgtccattcaaaac 


13681 


13700 1 


3 


SEQ ID NO 


3815 


acttgtcacctacatttct


7753 


7772 SEQ 


ID NO 


: 51 56 agaaaaaggggattgaagt 


10283 


10302 1 


3 


SEQ 


ID NO 


3816 


ctgattggtggactcttgc 


7770 


7789SEQ 


ID NO 


: 5157gcaagttaaagaaaatcag 


14026 


14045 1 


3 


SEQ 


ID NO 


3817 


atgaaagcattggtagagc 


7847 


7866 SEQ 


ID NO 


5158 gctcatctcctttcttcat


10208 


10227 1 


3 


SEQ 


ID NO 


: 3818


tgaaagcattggtagagca 


7848 


7867 SEQ 


ID NO 


: 5159 tgctcatctcctttcttca


10207 


10226 1


3 


SEQ 


ID NO 


3819 


gggttcactgttcctgaaa 


7868 


7887 SEQ 


ID NO 


: 5160 tttcaccatagaaggaccc


8959 


8978 1 


3 


SEQ 


ID NO 


3820 


tcaagaccatccttgggac 


7887 


7906sEQ 


ID NO 


: 5161 gtccccctaacagatttga


7973 


7992 1 


3 


SEQ 


ID NO 


: 3821 


ccttgggaccatgcctgcc 


7897 


7916 SEQ


ID NO 


5162ggcaccagggctcggaagg 


13978 


13997 1 


3 


SEQ 


ID NO 


: 3822 


ttcaggctcttcagaaagc 


7929 


7948 SEQ 


ID NO 


: 5163 gcttgaaggaattcttgaa


9588 


9607 1 


3 


SEQ 


ID NO 


: 3823 


ttcagataaacttcaaaga 


8004 


8023 SEQ 


ID NO 


; 5164tcttcataagttcaatgaa 


13183 


13202 1


3 


SEQ 


ID NO 


3824 


acttcaaagacttaaaaaa


8013 


8032 SEQ 


ID NO 


: 5165 ttttaacaaaagtggaagt


6829 


6848 1 


3 


SEQ 


ID NO 


3825 


atcccatccaggttttcca 


8039 


8058 SEQ 


ID NO 


5166 tggagaagcaaatctggat


9472 


9491 1 


3 


SEQ 


ID NO 


3826


gaatttaccatccttaaca


8063 


8082 SEQ 


ID NO 


51 67tgttgaagtgtctccattc 


9889 


9908 1 


3 


SEQ 


ID NO 


3827 


cattccttcctttacaatt 


8089 


8108SEQ 


ID NO 


; 5168aattccaattttgagaatg 


10414 


10433 1 


3 


SEQ 


ID NO 


: 3828 


ttgaccagatgctgaacag 


8145 


8164SEQ 


ID NO 


; 51 69 ctgttgaaagatttatcaa 


12932 


12951 1 


3 


SEQ 


ID NO 


3829 


aatcaccctgccagacttc 


8233 


8252 SEQ 


ID NO 


51 70 gaagttctcaattttgatt 


8522 


8541 1 


3 


SEQ 


ID NO 


3830 


tgaccttcacataccagaa 


8320 


8339SEQ 


ID NO 


5171 ttcttctggaaaagggtca 


8884 


8903 1 


3 


SEQ 


ID NO 


: 3831 


ttccagcttccccacatct 


8339 


8358 SEQ 


ID NO 


51 72 agattctcagatgagggaa 


8921 


8940 1 


3 


SEQ 


ID NO 


3832 


aagctatacagtattctga


8387 


8406 SEQ 


ID NO:


51 73tcagatggcattgctgctt 


11612 


11631 1 


3 


SEQ 


ID NO 


3833 


attctgaaaatccaatctc 


8399 


841 8 SEQ 


ID NO: 


51 74 gagataaccgtgcctgaat 


11552 


11571 1 


3 


SEQ 


ID NO 


3834 


tttcacattagatgcaaat 


8422 


8441 SEQ 


ID NO: 


51 75 attttgaaaaaaacagaaa 


9738 


9757 1 


3 


SEQ 


ID NO 


3835 


caaatgctgacatagggaa 


8436 


8455 SEQ 


ID NO: 


5 1 76 ttccatcacaaatcctttg 


9670 


9689 1 


3 


SEQ 


ID NO 


3836 


gagagtccaaattagaagt 


8508 


8527 SEQ 


ID NO: 


5177 actttacttcccaactctc 


13410 


13429 1


3 


SEQ 


ID NO 


3837 


agagtccaaattagaagtt 


8509 


8528 SEQ


ID NO: 


5178 aactttacttcccaactct 


13409 


13428 1 


3 


SEQ 


ID NO 


3838 


tctcaattttgattttcaa 


8527 


8546 SEQ 


ID NO: 


51 79ttgattcccttttttgaga 


11537 


11556 1 


3 


SEQ 


ID NO 


3839 


caattttgattttcaagca 


8530 


8549 SEQ 


ID NO 


5180 tgctgaatccaaaagattg


13660 


13679 1 


3 


SEQ 


ID NO 


3840 


aatgcacaactctcaaacc 


8549 


8568SEQ 


ID NO: 


5181 ggtttatcaaggggccatt


12460 


12479 1


3 


SEQ 


ID NO 


3841 


agttctccagcaagtacct 


8604 


8623 SEQ 


ID NO: 


5182 aggttccatcgtgcaaact


11388 


11407 1 


3 


SEQ 


ID NO 


3842 


agtacctgagaacggagca 


8616 


8635SEQ 


ID NO: 


51 83 tgctccaggagaacttact 


13780 


13799 1 


3 


SEQ 


ID NO. 


3843 


tcaaacacagtggcaagtt


8678 


8697SEQ 


ID NO: 


51 84aactctcaagtcaagttga 


13422 


13441 1 


3 


SEQ 


ID NO: 


3844 


acaatcagcttaccctgga


8751 


8770 SEQ 


ID NO: 


51 85tccattctgaatatattgt 


13380 


13399 1


3 


SEQ 


ID NO: 


3845 


ctggatagcaacactaaat 


8765 


8784 SEQ 


ID NO: 


51 86 attttctgaacttccccag 


12702 


12721 1 


3 


SEQ 


ID NO: 


3846 


ctgacctgcgcaacgagat 


8829 


8848 SEQ 


ID NO: 


51 87 atctgatgaggaaactcag 


12259 


12278 1 


3 


SEQ 


ID NO: 


3847 


agatgagggaacacatgaa 


8929 


8948SEQ 


ID NO: 


51 88ttcatgtccctagaaatct 


10038 


10057 1


3 


SEQ 


ID NO: 


3848 


tcaacttttctaaacttga 


9060 


9079SEQ 


ID NO: 


51 89tcaaggataacgtgtttga 


12618 


12637 1


3 


SEQ 


ID NO; 


3849 


ttctaaacttgaaattcaa 


9067 


9086SEQ 


ID NO: 


5190 ttgatgatgctgtcaagaa 


7308 


7327 1 


3 


SEQ 


ID NO: 


3850


gaaattcaatcacaagtcg 


9077 


9096SEQ 


ID NO: 


5191 cgacgaagaaaataatttc 


13566 


13585 1 


3 


SEQ 


ID NO: 


3851 


cactgtttggagaagggaa 


9141 


9160SEQ 


ID NO: 


51 92ttccagaaagcagccagtg 


12506 


12525 1


3 


SEQ 


ID NO: 


3852 


actgtttggagaagggaag


9142 


9161 SEQ 


ID NO: 


51 93 cttccccaaagagaccagt 


2898 


2917 1 


3 


SEQ 


ID NO: 


3853 


aattctcttttcttttcag 


9221 


9240 SEQ 


ID NO: 


51 94ctgattactatgaaaaatt 


13638 


13657 1 


3 


SEQ 


ID NO: 


3854 


ttcttttcagcccagccat 


9230 


9249SEQ 


ID NO: 


51 95atggaaaagggaaagagaa 


13494 


13513 1 


3 


SEQ 


ID NO; 


3855 


tttgaaagttcgttttcca 


9283 


9302SEQ 


ID NO: 


51 96tggaagtgtcagtggcaaa 


10380 


10399 1


3 


SEQ 


ID NO: 


3856 


cagggaagatagacttcct 


9312 


9331 SEQ 


ID NO: 


5197 aggacctttcaaattcctg


9848 


9867 1 


3 


SEQ 


ID NO: 


3857 


ataagtacaaccaaaattt


9405 


9424 SEQ 


ID NO: 


51 98 aaatcaggatctgagttat 


14038 


14057 1


3 


SEQ 


ID NO: 


3858 


acaacgagaacattatgga 


9435 


9454 SEQ 


ID NO: 


51 99tccattctgaatatattgt 


13380 


13399 1 


3 


SEQ 


ID NO: 


3859 


aggaataaatggagaagca 


9463 


9482 SEQ 


ID NO: 


5200 tgctggaattgtcattcct 


11734 


11753 1 


3 


SEQ 


ID NO: 


3860 


agcaaatctggatttctta 


9478 


9497SEQ 


ID NO: 


5201 taagttctctgtacctgct


11719 


11738 1 


3 


SEQ 


ID NO: 


3861 


tcctttaacaattcctgaa


9502 


9521 SEQ 


ID NO: 


5202ttcaaaacgagcttcagga 


13206 


13225 1


3 


SEQ 


ID NO: 


3862 


tttaacaattcctgaaatg


9505 


9524 SEQ 


ID NO: 


5203 catttgatttaagtgtaaa 


9621 


9640 1 


3 
SEQ ID NO


: 3863 


dUdOctct LiaalCclC33ClCC 


9534 9553 SEQ ID NO


5204 ggagacagcatattcgtgt 


11211 


11230 


1 3 


SEQ ID NO 


: 3864 




Of O^ftft 

9561 9580 SEQ ID NO


: 5205 tcccagaaaacctcttctt


3936 


3955 


1 3 


SEQ ID NO 


: 3865 




9578 9597 SEQ ID NO


: 5206 ccttttacaattcattttc 


13021 


13040 


1 3 


SEQ ID NO 


: 3866 


ttgaaggaattcttgaaaa


9590 9609SEQIDNO 


: 5207ttttgagaatgaatttcaa 


10422 


10441 


1 3 


SEQ ID NO 


: 3867 


tgaaggaattcttgaaaac


9591 9610 SEQ ID NO


: 5208 gttttggctgataaattca 


11291 


11310 


1 3 


SEQ ID NO 


: 3868 


agctcagtataagaaaaac


9640 9659 SEQ ID NO


: 5209gtttgataagtacaaagct 


9805 


9824 


1 3 


SEQ ID NO


: 3869 


if* siQ cif r^^^HH"*! a Q fi n r» Q 


fi"70n rt"7on 

9720 9739SEQ ID NO 


: 5210tgcctgagcagaccattga 


11688 


11707 


1 3 


SEQ ID NO


: 3870


cl Ly cJcJcaLfdlcJacJa llaaUlt 


9789 9808 SEQ ID NO


: 5211 aactttgcactatgttcat


12762 


12781 


1 3 


SEQ ID NO


: 3871 


a a f 1 rr't n ri a t ca r* o f^l" n <+ 
dd lLlj>v>lijy d LolLidLi|.g(l 


9859 9878SEQ ID NO 


: 5212aacacatgaatcacaaatt 


8938 


8957 


1 3 


SEQ ID NO 


: 3872 


ttccagttgtcaatgttga


9876 9895 SEQ ID NO


: 5213 tcaaaacgagcttcaggaa


13207 


13226 


1 3 


SEQ ID NO 


: 3873 


aagtgtctccattcaccat


9894 9913SEQ ID NO 


: 5214atgggaagtataagaactt 


4842 


4861 


1 3 


SEQ ID NO 


: 3874 


gtcagcatgcctagtttct


9950 9969 SEQ ID NO


: 5215agaaaaggcacaccttgac 


11080 


11099 


1 3 


SEQ ID NO 


: 3875 


ctgccatgggcaatattac


10113 10132SEQIDNO 


: 521 6 gtaagaaaatacagagcag 


6440 


6459 


1 3 


SEQ ID NO 


: 3876 


tgaataccaatgctgaact


10167 10186SEQ ID NO 


: 5217 agttgaaggagactattca


7224 


7243 


1 3 


SEQ ID NO 


3877 


tattgttgctcatctcctt


10201 10220SEQ ID NO 


5218aaggaaacataaactaata 


12889 


12908 


1 3 


SEQ ID NO 


: 3878 


tgttgctcatctcctttct


10204 10223 SEQ ID NO


; 5219agaagaaatctgcagaaca 


12431 


12450 


1 3 


SEQ ID NO 


: 3879 


tctgtcattgatgcactgc


10232 10251 SEQ ID NO 


: 5220gcagtagactataagcaga 


13928 


13947


1 3 


SEQ ID NO


: 3880 


ccacagctctgtctctgag


10305 10324 SEQ ID NO


5221 ctcagggatctgaaggtgg


8195 


8214 


1 3 


SEQ ID NO


3881 


atttgtggagggtagtcat


10330 10349 SEQ ID NO


5222atgaagtagaccaacaaat 


7161 


7180 


1 3 


SEQ ID NO


3882


atatggaagtgtcagtggc


10377 10396SEQ ID NO 


5223 gccacactccaacgcatat


10778 


10797 


1 3 


SEQ ID NO


3883 


tggaaataccaagtcaaaa


10453 10472SEQ ID NO 


5224ttttacaattcattttcca 


13023 


13042 


1 3 


SEQ ID NO:


3884 


aagtcaaaacctactgtct


10463 10482SEQ ID NO 


5225 agacctagtgattacactt 


12859 


12878 ' 


1 3 


SEQ ID NO:


3885


actgtctcttcctccatgg


10475 10494SEQ ID NO 


5226 ccatgcaagtcagcccagt


10924 


10943


1 3 


SEQ ID NO: 


3886 


cttcctccatggaatttaa


10482 10501 SEQ ID NO 


5227 ttaatcgagaggtatgaag


7148 


7167 ' 


1 3 


SEQ ID NO:


3887


attcttcaatgctgtactc


10512 10531 SEQ ID NO 


5228gagttgagggtccgggaat 


12242 


12261 


I 3 


SEQ ID NO:


3888


ttgaccacaagcttagctt


10548 10567SEQ ID NO: 


5231 aagcgcacctcaatatcaa 


12036 


12055 ' 


1 3 


SEQ ID NO:


3889 


v^ClCo ICllaCll llCC 


10573 10592SEQ ID NO: 


5232ggaactattgctagtgagg 


10649 


10668 1


1 3 


SEQ ID NO:


3890


agctgcagggcacttccaa


10710 10729SEQ ID NO: 


5233ttgggaagaagaggcagct 


12289 


12308 1 


3 


SEQ ID NO:


3891


ttccaaaattgatgatatc


10723 10742SEQ ID NO: 


5234gatatacactagggaggaa 


12745 


12764 1 


3 


SEQ ID NO:


3892


y aydaL.diaCdagcaaagc 


H noon 'ino"7n 

10860 10879 SEQ ID NO:


5235 gcttggttttgccagtctc 


2467 


2486 1 


3 


SEQ ID NO:


3893 


dLyyudddiyLOdgciCll 


10897 10916 SEQ ID NO:


5236 aagaggtatttaaagccat 


12960 


12979 1 


3 


SEQ ID NO:


3894 


tn n £1 at nt r>onr^/«l-lrv 

lyywddaigiuagciciig 


10898 10917SEQ ID NO: 


5237 caagaggtatttaaagcca 


12959 


12978 1 


3 


SEQ ID NO:


3895 


t+ntfoo n ri ff^i^ a tn ^ o o /t 
iiy iifi^dy y iLrUaigcaay 


10914 10933SEQ ID NO: 


5238 cttgggggaggaggaacaa


14066 


14085 1 


3 


SEQ ID NO:


3896 


ly I LL>dy y iL>L>diy Cddgi 


10915 10934SEQ ID NO: 


5239 acttgggggaggaggaaca 


14065 


14084 1 


3 


SEQ ID NO:


3897


dguccxiccaigauicc 


10940 10959SEQ ID NO: 


5240 ggaatctgatgaggaaact 


12256 


12275 1 


3 


SEQ ID NO:


3898


tgctaacactaagaaccag


10987 11006 SEQ ID NO:


5241 ctggatgtaaccaccagca 


11186 


11205 1 


3 


SEQ ID NO: 


3899 


actaagaaccagaagatca


10994 11013SEQ |D NO: 


5242tgatcaagaacctgttagt 


13347 


13366 1 


3 


SEQ ID NO:


3900


ctaagaaccagaagatcag


10995 11014 SEQ ID NO:


5243 ctgatcaagaacctgttag 


13346 


13365 1


3 


SEQ ID NO:


3901 


cagaagatcagatggaaaa


11003 11 022 SEQ ID NO: 


5244ttttcagaccaactctctg 


13622 


13641 1 


3 


SEQ ID NO:


3902


aaaaatgaagtccggattc


11018 11037SEQ ID NO: 


5245 gaatttgaaagttcgtttt 


9280 


9299 1 


3 


SEQ ID NO:


3903 


gattcattctgggtctttc


11032 11 051 SEQ ID NO: 


5246 gaaaacctatgccttaatc 


13166 


13185 1 


3 


SEQ ID NO:


3904 


aagaaaaggcacaccttga


11079 11098SEQ ID NO: 


5247 tcaaaacctactgtctctt


10466 


10485 1 


3 


SEQ ID NO: 


3905 


aaggacacctaaggttcct 


11115 11134 SEQ ID NO:


5248 aggacaccaaaataacctt 


7572 


7691 1 


3 


SEQ ID NO: 


3906 


ccagcattggtaggagaca 


11199 11218SEQ ID NO: 


5249 tgtcaacaagtaccactgg 


12370 


12389 1 


3 


SEQ ID NO: 


3907 


ctttgtgtacaccaaaaac 


11239 11258SEQ ID NO: 


5250 gtttttaaattgttgaaag 


13148 


13167 1 


3 


SEQ ID NO: 


3908 


ccatccctgtaaaagtttt 


1127711296SEQ ID NO: 


5251 aaaagggtcatggaaatgg 


8893 


8912 1 


3 


SEQ ID NO: 


3909 


tgatctaaattcagttctt 


11332 11351 SEQ ID NO: 


5252 aagatagtcagtctgatca 


13334 


13353 1 


3 


SEQ ID NO: 


3910 


aagaagctgagaacttcat 


11432 11451 SEQ ID NO:


5253 atgagatcaacacaatctt 


13110 


13129 1 


3 


SEQ ID NO: 


3911 


tttgccctcaacctaccaa 


11453 11472 SEQ ID NO:


5254 ttggtacgagttactcaaa 


12641 


12660 1 


3 


SEQ ID NO: 


3912 


cttgattcccttttttgag 


11536 11555 SEQ ID NO:


5255 ctcaattttgattttcaag 


8528 


8547 1 


3 
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SEQ ID NO 


: 3913 


ttcacgcttccaaaaagtg 


11591 11610 SEQ


ID NO 


5256cactcattgattttctgaa 


12693 


12712 


1 


3 


SEQ ID NO 


: 3914 


tgtttcagatggcattgct 


11608 11627 SEQ


ID NO 


: 5257agcagattatgttgaaaca 


11833 


11852 


1 


3 


SEQ ID NO 


: 3915 


aatgcagtagccaacaaga 


11639 11658 SEQ


ID NO 


; 5258tcttttcagcccagccatt 


9231 


9250 


1 


3 


SEQ ID NO 


: 3916 


ctgagcagaccattgagat 


11691 11710SEQ 


ID NO 


: 5259atctgatgaggaaactcag 


12259 


12278 


1 


3 


SEQ ID NO 


: 3917 


tgagcagaccattgagatt 


11692 11711 SEQ


ID NO 


: 5260aatctgatgaggaaactca 


12258


12277 


1 


3 


SEQ ID NO 


: 3918 


ttgagattccctccattaa 


11703 11722 SEQ


ID NO 


: 5261 ttaatcttcataagttcaa 


13179 


13198 


1 


3 


SEQ ID NO 


3919


acttggagtgccagtttga 


11807 11826 SEQ


ID NO 


; 5262tcaattgggagagacaagt 


6504 


6523 


1 


3 


SEQ ID NO 


: 3920 


caaatttgaaggacttcag 


12004 12023 SEQ


ID NO 


: 5263 ctgagaacttcatcatttg


11438 


11457 


1 


3 


SEQ ID NO 


: 3921 


agcccagcgttcaccgatc 


12056 12075SEQ 


ID NO 


: 5264 gatccaagtatagttggct 


13286 


13305


1 


3 


SEQ ID NO 


: 3922 


cagcgttcaccgatctcca


12060 12079 SEQ 


ID NO 


: 5265tggacctgcaccaaagctg 


13960


13979

1 


3 


SEQ ID NO 


: 3923 


ctccatctgcgctaccaga


12074 12093SEQ 


ID NO 


; 5266tctgatatacatcacggag 


13711 


13730 


1 


3 


SEQ ID NO 


: 3924 


atgaggaaactcagatcaa 


12264 12283 SEQ


ID NO 


; 5267ttgagttgcccaccalcat 


11667 


11686 


1 


3 


SEQ ID NO 


3925


aggcagcttctggcttgct 


12300 12319 SEQ


ID NO 


5268 agcaagtctttcctggcct 


3018 


3037 


1 


3 


SEQ ID NO 


3926 


tgaaagacaacgtgcccaa 


12327 12346 SEQ 


ID NO 


; 5269ttgggagagacaagtttca 


6508 


6527 


1 


3 


SEQ ID NO 


: 3927 


tatgattatgtcaacaagt 


12362 12381 SEQ 


ID NO 


; 5270actttgcactatgttcata 


12763 


12782 


1 


3 


SEQ ID NO 


3928 


cattaggcaaattgatgat 


12475 12494SEQ 


ID NO 


: 5271 atcaacacaatcttcaatg


13115 


13134 


1 


3 


SEQ ID NO. 


3929 


ttgactcaggaaggccaag 


12584 12603SEQ 


ID NO 


5272cttggtacgagttactcaa 


12640 


12659 


1 


3 


SEQ ID NO: 


3930 


gaaacctgggatatacact 


12736 12755 SEQ


ID NO: 


5273agtgattacacttcctttc 


12865 


12884 


1 


3 


SEQ ID NO: 


3931 


tcctttcgagttaaggaaa 


12877 12896SEQ 


ID NO: 


5274 tttctgccactgctcagga


13524 


13543 


1 


3 


SEQ ID NO: 


3932 


gccattcagtctctcaaga 


12974 12993SEQ 


ID NO: 


5275tcttccgttctgtaatggc 


5802 


5821 


1 


3 


SEQ ID NO: 


3933 


gtgctacgtaatcttcagg 


13001 13020SEQ 


ID NO: 


5276 cctgcaccaaagctggcac


13964 


13983 


1 


3 


SEQ ID NO: 


3934 


agctgaaagagatgaaatt 


13065 13084SEQ 


ID NO: 


5277 aatttattcaaaacgagct 


13200 


13219 


1 


3 


SEQ ID NO: 


3935 


aatttacttatcttattaa 


13080 13099 SEQ 


ID NO: 


5278 ttaaaagaaatcttcaatt 


13815 


13834 


1 


3 


SEQ ID NO: 


3936 


ttttaaattgttgaaagaa 


13150 13169 SEQ


ID NO: 


5279ttctctctatgggaaaaaa 


9566 


9585 


1 


3 


SEQ ID NO: 


3937 


taatcttcataagttcaat 


13180 13199SEQ 


ID NO: 


5280 attgagattccctccatta 


11702 


11721 


1 


3 


SEQ ID NO: 


3938 


atattttgatccaagtata 


13279 13298 SEQ 


ID NO: 


5281 tataagcagaagcacatat 


13937 


13956 


1 


3 


SEQ ID NO: 


3939 


tgaaatattatgaacttga 


13311 13330SEQ 


ID NO: 


5282tcaaccttaatgattttca 


8295 


8314 


1 


3 


SEQ ID NO: 


3940 


caatttctgcacagaaata 


13442 13461 SEQ 


ID NO: 


5283 tattcttcttttccaattg 


13834 


13853 


1 


3 


SEQ ID NO: 


3941 


agaagattgcagagctttc 


13509 13528SEQ 


ID NO: 


5284 gaaatcttcaatttattct 


13821 


13840 


1 


3 


SEQ ID NO: 


3942 


gaagaaaataatttctgat 


13570 13589SEQ 


ID NO: 


5285 atcagttcagataaacttc 


7999 


8018 


1 


3 


SEQ ID NO: 


3943 


ttgacctgtccattcaaaa 


13680 13699 SEQ 


ID NO:


5286ttttgagaatgaatttcaa 


10422 


10441 


1 


3 


SEQ ID NO: 


3944 


tcaaaactaccacacattt 


13693 13712 SEQ


ID NO:


5287 aaattccttgacatgttga 


7370 


7389 


1 


3 


SEQ ID NO: 


3945 


ttttttaaaagaaatcttc


13811 13830SEQ 


ID NO:


5288 gaagtgtcagtggcaaaaa 


10382 


10401 


1 


3 


SEQ ID NO: 


3946 


aggatctgagttattttgc 


14043 14062SEQ 


ID NO:


5289 gcaagggttcactgttcct 


7864 


7883 


1 


3 


SEQ ID NO: 


3947 


tttgctaaacttgggggag 


14057 14076 SEQ 


ID NO:


5290 ctccccaggacctttcaaa


9842 


9861 


1 


3 
Table 10. Selected palindromic sequences from human glucose-6-phosphatase



Source  Start Index  End Index  Match  Start Index  End Index  #  B

SEQ ID NO: 5291 tccatcttcaggaagctgt 222 241 SEQ ID NO: 5369 acagactctttcagatgga 1340 1359 1 6
SEQ ID NO: 5292 ccatcttcaggaagctgtg 223 242 SEQ ID NO: 5370 cacagactctttcagatgg 1339 1358 1 6
SEQ ID NO: 5293 cctctggccatgccatggg 417 436 SEQ ID NO: 5371 cccattttgaggccagagg 1492 1511 1 6
SEQ ID NO: 5294 ctctggccatgccatgggc 418 437 SEQ ID NO: 5372 gcccattttgaggccagag 1491 1510 1 6
SEQ ID NO: 5295 ttgaatgtcattttgtggt 521 540 SEQ ID NO: 5373 accatacattatcattcaa 2945 2964 1 6
SEQ ID NO: 5296 tcagtaatgggggaccagc 1886 1905 SEQ ID NO: 5374 gctggtctcgaactcctga 2731 2750 1 6
SEQ ID NO: 5297 ttttactgtgcatacatgt 1956 1975 SEQ ID NO: 5375 acatctttgaaaagaaaaa 2983 3002 1 6
SEQ ID NO: 5298 tgaggtgccaaggaaatga 50 69 SEQ ID NO: 5376 tcatgtctcagcctcctca 2620 2639 1 5
SEQ ID NO: 5299 gaggtgccaaggaaatgag 51 70 SEQ ID NO: 5377 ctcatgtctcagcctcctc 2619 2638 1 5
SEQ ID NO: 5300 gggaaagataaagccgacc 487 506 SEQ ID NO: 5378 ggtcgcctggcttattccc 1295 1314 1 5
SEQ ID NO: 5301 ttttcctcatcaagttgtt 598 617 SEQ ID NO: 5379 aacatctttgaaaagaaaa 2982 3001 1 5
SEQ ID NO: 5302 ctttcagccacatccacag 651 670 SEQ ID NO: 5380 ctgtggactctggagaaag 773 792 1 5
SEQ ID NO: 5303 tggactctggagaaagccc 776 795 SEQ ID NO: 5381 gggclggctctcaactcca 884 903 1 5
SEQ ID NO: 5304 agcctcctcaagaacctgg 848 867 SEQ ID NO: 5382 ccagattcttccaclggct 2107 2126 1 5
SEQ ID NO: 5305 [illegible] 878 897 SEQ ID NO: 5383 tgagccaccgcaccgggcc 2801 2820 1 5
SEQ ID NO: 5306 gagctcactcccactggaa 1439 1458 SEQ ID NO: 5384 ttccaggtagggccagctc 1676 1695 1 5
SEQ ID NO: 5307 agctaatgaagctattgag 1572 1591 SEQ ID NO: 5385 ctcagcctcctcagtagct 2626 2645 1 5
SEQ ID NO: 5308 gctaatgaagctattgaga 1573 1592 SEQ ID NO: 5386 tctcagcctcctcagtagc 2625 2644 1 5
SEQ ID NO: 5309 ctaaatggctttaattata 1854 1873 SEQ ID NO: 5387 tatatttttagaattttag 2683 2702 1 5
SEQ ID NO: 5310 ctgcttttctttttttttc 2509 2528 SEQ ID NO: 5388 gaaaaatatatatgtgcag 2996 3015 1 5
SEQ ID NO: 5311 caatcaccaccaagcctgg 0 19 SEQ ID NO: 5389 ccagaatgggtccacattg 812 831 1 4
SEQ ID NO: 5312 agcctggaataactgcaag 12 31 SEQ ID NO: 5390 cttggatttctgaatggct 1987 2006 1 4
SEQ ID NO: 5313 gttccatcttcaggaagct 220 239 SEQ ID NO: 5391 agctcactcccactggaac 1440 1459 1 4
SEQ ID NO: 5314 tggtgggttttggatactg 326 345 SEQ ID NO: 5392 cagtcctcccaccctacca 2425 2444 1 4
SEQ ID NO: 5315 acctgtgagactggaccag 392 411 SEQ ID NO: 5393 ctggagaaagcccagaggt 782 801 1 4
SEQ ID NO: 5316 gctgrtacagaaactttca 638 657 SEQ ID NO: 5394 tgaatggtcttctgccagc 1474 1493 1 4
SEQ ID NO: 5317 acagcatctataatgccag 666 685 SEQ ID NO: 5395 ctgggtgtagacctcctgt 758 777 1 4
SEQ ID NO: 5318 gggtgtagacctcctgtgg 760 779 SEQ ID NO: 5396 ccacattgacaccacaccc 823 842 1 4
SEQ ID NO: 5319 ggtgtagacctcctgtgga 761 780 SEQ ID NO: 5397 tccacattgacaccacacc 822 841 1 4
SEQ ID NO: 5320 gtgtagacctcctgtggac 762 781 SEQ ID NO: 5398 gtccacattgacaccacac 821 840 1 4
SEQ ID NO: 5321 gacctcctgtggactctgg 767 786 SEQ ID NO: 5399 ccagatattgcactaggtc 2014 2033 1 4
SEQ ID NO: 5322 cctgggcacgctctttggc 862 881 SEQ ID NO: 5400 gccagctcacaagcccagg 1687 1706 1 4
SEQ ID NO: 5323 ctgggcacgctctttggcc 863 882 SEQ ID NO: 5401 ggccagctcacaagcccag 1686 1705 1 4
SEQ ID NO: 5324 ctggtcttctacgtcttgt 1028 1047 SEQ ID NO: 5402 acaaaagcaagacttccag 1663 1682 1 4
SEQ ID NO: 5325 agagtgcggtagtgcccct 1056 1075 SEQ ID NO: 5403 agggccaggattcctctct 2229 2248 1 4
SEQ ID NO: 5326 tgggcactggtatttggag 1217 1236 SEQ ID NO: 5404 ctcccactggaacagccca 1446 1465 1 4
SEQ ID NO: 5327 gaattaaatcacggatggc 1267 1286 SEQ ID NO: 5405 gccaaccaagagcacattc 2311 2330 1 4
SEQ ID NO: 5328 tgttgctagaagttgggtt 1598 1617 SEQ ID NO: 5406 aaccatcctgctcataaca 2967 2986 1 4
SEQ ID NO: 5329 aggagctctgaatctgata 1764 1783 SEQ ID NO: 5407 tatcacattacatcatcct 2063 2082 1 4
SEQ ID NO: 5330 taaatggctttaattatat 1855 1874 SEQ ID NO: 5408 atatatgtgcagtatttta 3003 3022 1 4
SEQ ID NO: 5331 aaaatgacaaggggagggc 2215 2234 SEQ ID NO: 5409 gccctccttgcctgttttt 2817 2836 1 4
SEQ ID NO: 5332 ttaaaggaaaagtcaacat 2330 2349 SEQ ID NO: 5410 atgtgcagtattttattaa 3007 3026 1 4
SEQ ID NO: 5333 acatcttctctcttttttt 2345 2364 SEQ ID NO: 5411 aaaagaaaaatatatatgt 2992 3011 1 4
SEQ ID NO: 5334 ttctacgtcctcttcccca 197 216 SEQ ID NO: 5412 tgggccagccgcacaagaa 1116 1135 1 3
SEQ ID NO: 5335 tgggtagctgtgattggag 257 276 SEQ ID NO: 5413 ctcccactggaacagccca 1446 1465 1 3
Source  Start Index  End Index  Match  Start Index  End Index  #  B

SEQ ID NO: 5336 gctgtgattggagactggc 263 282 SEQ ID NO: 5414 gccatgccatgggcacagc 423 442 1 3
SEQ ID NO: 5337 cacttccgtgcccctgata 358 377 SEQ ID NO: 5415 tatcacccaggctggagtg 2548 2567 1 3
SEQ ID NO: 5338 acatctactctttccatct 464 483 SEQ ID NO: 5416 agatgggatttcatcatgt 2705 2724 1 3
SEQ ID NO: 5339 ctactctttccatctttca 468 487 SEQ ID NO: 5417 tgaatactctcacaagtag 1419 1438 1 3
SEQ ID NO: 5340 agataaagccgacctacag 492 511 SEQ ID NO: 5418 ctgtttttcaatci:catct 2828 2847 1 3
SEQ ID NO: 5341 tgtgcagctgaatgtctgt 553 572 SEQ ID NO: 5419 acagaaactttcagccaca 644 663 1 3
SEQ ID NO: 5342 atgtctgtctgtcacgaat 564 583 SEQ ID NO: 5420 attcaggtatagctgacat 2038 2057 1 3
SEQ ID NO: 5343 ctgtcacgaatctaccttg 572 591 SEQ ID NO: 5421 caaggtgctaggattacag 2779 2798 1 3
SEQ ID NO: 5344 atcaagttgttgctggagt 606 625 SEQ ID NO: 5422 actcctgacctcaagtgat 2742 2761 1 3
SEQ ID NO: 5345 cagaaactttcagccacat 645 664 SEQ ID NO: 5423 atgtttcaattaggctctg 2185 2204 1 3
SEQ ID NO: 5346 actttcagccacatccaca 650 669 SEQ ID NO: 5424 tgtggcgtatcatgcaagt 1818 1837 1 3
SEQ ID NO: 5347 atgccagcctcaagaaata 678 697 SEQ ID NO: 5425 tattttttttactgtgcat 1950 1969 1 3
SEQ ID NO: 5348 agaaatattttctcattac 690 709 SEQ ID NO: 5426 gtaaatatgactcctttct 2283 2302 1 3
SEQ ID NO: 5349 gaaatattttctcattacc 691 710 SEQ ID NO: 5427 ggtaaatatgactcctttc 2282 2301 1 3
SEQ ID NO: 5350 ^gctgctcaagggactggg 744 763 SEQ ID NO: 5428 cccaagccaaccaagagca 2306 2325 1 3
SEQ ID NO: 5351 cctgtggactctggagaaa 772 791 SEQ ID NO: 5429 tttcatcatgttggccagg 2713 2732 1 3
SEQ ID NO: 5352 [illegible] 784 803 SEQ ID NO: 5430 ccaccgcaccgggccctcc 2805 2824 1 3
SEQ ID NO: 5353 ttgaaacccccatcccaag 1004 1023 SEQ ID NO: 5431 cttgaattcctgggctcaa 2405 2424 1 3
SEQ ID NO: 5354 cagatggaggtgccatatc 1351 1370 SEQ ID NO: 5432 gatatgcagagtatttctg 2847 2866 1 3
SEQ ID NO: 5355 ggagctcactcccactgga 1438 1457 SEQ ID NO: 5433 tccacctgccttggcctcc 2760 2779 1 3
SEQ ID NO: 5356 ttgggtaatgtttttgaaa 1553 1572 SEQ ID NO: 5434 tttctctatcccaagccaa 2297 2316 1 3
SEQ ID NO: 5357 gaagttgggttgttctgga 1606 1625 SEQ ID NO: 5435 tccaccccactggatcttc 2131 2150 1 3
SEQ ID NO: 5358 aaaagaaggctgcctaagg 1785 1804 SEQ ID NO: 5436 ccttgcctgcttttctttt 2503 2522 1 3
SEQ ID NO: 5359 aaagaaggctgcctaagga 1786 1805 SEQ ID NO: 5437 tccttgcctgcttttcttt 2502 2521 1 3
SEQ ID NO: 5360 aagaaggctgcctaaggag 1787 1806 SEQ ID NO: 5438 ctccttgcctgcttttctt 2501 2520 1 3
SEQ ID NO: 5361 agaaggctgcctaaggagg 1788 1807 SEQ ID NO: 5439 cctccttgcctgcttttct 2500 2519 1 3
SEQ ID NO: 5362 atttccttggatttctgaa 1982 2001 SEQ ID NO: 5440 ttcaattaggctctgaaat 2189 2208 1 3
SEQ ID NO: 5363 tccttataagcccagctct 2081 2100 SEQ ID NO: 5441 agagcacattcttaaagga 2319 2338 1 3
SEQ ID NO: 5364 ataagcccagctctgcttt 2086 2105 SEQ ID NO: 5442 aaagctgaagcctatttat 2889 2908 1 3
SEQ ID NO: 5365 ggccaggattcctctctca 2231 2250 SEQ ID NO: 5443 tgagccaccgcaccgggcc 2801 2820 1 3
SEQ ID NO: 5366 gccaactcctccttgcctg 2493 2512 SEQ ID NO: 5444 caggctggagtggagtggc 2555 2574 1 3
SEQ ID NO: 5367 ttttttttctttttttgag 2519 2538 SEQ ID NO: 5445 ctcataacatctttgaaaa 2977 2996 1 3
SEQ ID NO: 5368 ccggcgtgcaccaccatgc 2652 2671 SEQ ID NO: 5446 gcatgagccaccgcaccgg 2798 2817 1 3
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Table 11. Selected palindromic sequences from rat glucose-6-phosphatase 





Source 


Start End 
Index Index 


Match


Start End 
Index Index


#


B


SEQ ID NO. 




301 


320SEQ ID NO- 




598 


617 


1 

1 


6 


SEQ ID NO. 


544o oicuggggugggguiy g 


831 


850SEQ ID NO° 


^4.7^ rr'annpifnl Tipr'npRr^irirTia 


659 


678 


1 

1 


6 


SEQ ID NO: 


igccioayy ay aocigcgus 


879 


898SEQ ID NO* 




1019 


1038 


i 


6 


SEQ ID NO: 


5450 ccicyggccaigccaiggg 


376 


395SEQ ID NO" 




1171 


1190 


1 

1 


5 


SEQ ID NO: 




1478 


1497SEQ ID NO* 


't14 7 tfn p a n 3 n tn i n t nttna R 


2057 


2076 


1 

t 


5 


SEQ ID NO: 


54b^ Cagcuccigsggicioocia 


2 


21 SEQ ID NO: 


v»*+/ \j ity y ly n^iyiyai^y^ty 


123 


142 


1 

1 


4 

• 


SEQ ID NO: 


5453 ggiacGaaggaggaaggai 


13 


32SEQ ID NO: 


^ d. 7 7 at p pfi n tpn p t fsa rttanc 


66 


85 


1 


4 


SEQ ID NO: 


54 54 ciccacgaciiigggcaico 


51 


70sEO ID NO- 




1448 


1467 


1 

1 


4 


SEQ ID NO: 


5455 caggactggutgtciigg 


108 


127SEQ ID NO- 


'^ZL7Q pp^^anpppnapfotnpptn 
\j'-r 1 i3 L^LrddyoLfOyciU'iy lyv-rL/iy 


2018 


2037 


1 
1 


4 


SEQ ID NO: 


5456 cttctatgtcctctttccc


155 


174 SEQ ID NO:


^ZLftOnnnQpanapaiPPiPSiflns^Pin 
OH-ovj y yy ciLrciy oLfdi^dLrdciy ooy 


1076 


1095 


1 


4 


SEQ ID NO: 


5457 ttctatgtcctctttccca


156 


175SEQ ID NO: 


oh-o I igyyaUaydUdLrduciciyeici 


1075 


1094 


1 


4 


SEQ ID NO 


5458tggttccacattcaagaga 


177 


196SEQ ID NO: 


5482 tctcaataatgatagacca


1549 


1568 


1 


4


SEQ ID NO 


5459tgcctctgataaaacagtt 


325 


344SEQ ID NO: 


5483aactctgagatcttgggca 


1868 


1887 


1 


4 


SEQ ID NO 


5460agcccggctcctgggacag 


1064 


1083SEQ ID NO 


5484ctgtcctccagcctgggct 


2034 


2053 


1 


4 


SEQ ID NO 


: 5461 agtctctgacacaagtcag


1111 


1130SEQ ID NO 


5485 ctgaatggtaatggtgact 


1659 


1678 


1 


4 


SEQ ID NO 


: 5462 aaaaaggtgaatttttaaa 


1237 


1256SEQ ID NO 


5486tttattaaaacgacatttt 


2201 


2220 


1 


4 


SEQ ID NO 


: 5463acactctcaataatgatag 


1545 


1564SEQ ID NO 


5487 ctatgaatgatgcctgtgt 


2121 


2140 


1 


4 


SEQ ID NO 


5464aaagaatgaacgtgctcca 


37 


56SEQ ID NO 


5488tggacctccigtggacttt 


724 


743 


1 


3 


SEQ ID NO 


: 5465 ctttgggatccagtcgact 


59 


78SEQ ID NO 


5489 agtcagcggccgtgcaaag 


1124 


1143 


1 


3 


SEQ ID NO 


: 5466gtgatcgctgacctcagga 


132 


151 SEQ ID NO 


5490tcctctctccaaaggtcac 


1911 


1930 


1 


3 


SEQ ID NO 


5467ggaacgccttctatgtcct 


148 


167SEQ ID NO 


: 5491 aggactcatcactgcttcc 


1748 


1767 


1 


3 


SEQ ID NO 


: 5468 gactgtgggcatcaatctc


194 


213SEQ ID NO 


: 5492gagactggaccagggagtc 


357 


376 


1 


3 


SEQ ID NO 


: 5469ggacactgactattacagc 


296 


315SEQ ID NO 


: 5493 gctgaacgtctgtctgtcc 


518 


537 


1 


3 


SEQ ID NO 


: 5470 aagcccccgtcccagattg 


966 


985SEQ ID NO 


: 5494 caattgtttgctggtgctt


1833 


1852 


1 


3 
Table 12. Selected palindromic sequences from human β-catenin



SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 



Source 

5495 agcagcttcagtccccgcc
5496 ccattctggtgccactacc
5497 tccttctctgagtggtaaa
5498 tctgagtggtaaaggcaat
5499 cagagggtacgagctgcta
5500 ctaaatgacgaggaccagg
5501 taaatgacgaggaccaggt
5502 gtcctgtatgagtgggaac
5503 cccagcgccgtacgtccat
5504 tcccctgagggtatttgaa
5505 gggtatttgaagtatacca
5506 gctgttagtcactggcagc
5507 gtcctgtatgagtgggaac
5508 tcctgtatgagtgggaaca
5509 gtatgcaatgactcgagct
5510 gtccagcgtttggctgaac
5511 tatcaagatgatgcagaac
5512 tatggtccatcagctttct
5513 ccctggtgaaaatgcttgg
5514 agctttaggacttcacctg
5515 ggaatctttcagatgctgc
5516 tgtccttcgggctggtgac
5517 cacagctcctctgacagag
5518 ccagacagaaaagcggctg
5519 cagcagcgttggcccggcc
5520 aggtctgaggagcagcttc
5521 actgttttgaaaatccagc
5522 ctgatttgatggagttgga
5523 ccagacagaaaagcggctg
5524 acagctccttctctgagtg
5525 tggatacctcccaagtcct
5526 tcaagaacaagtagctgat
5527 agctcagagggtacgagct
5528 gcatgcagatcccatctac
5529 ccacacgtgcaatccctga
5530 cacacgtgcaatccctgaa
5531 ggaccttgcataacctttc
5532 ctccacaaccttttattac
5533 cagagtgctgaaggtgcta
5534 ggactctcaggaatctttc
5535 tgatataaatgtggtcacc
5536 cccagcgccgtacgtccat
5537 gtccatgggtgggacacag
5538 ttgtaccggagcccttcac
5539 ttgttatcagaggactaaa



Start End 

Index index 
70 



328 
334 
473 
677 
678 
383 



151 
260 

383 
384 
454 
563 
623 
718 

915 



89SEQ ID NO 
323SEQ ID NO 
347SEQ ID NO 
353SEQ ID NO 
492SEQ ID NO 
696 SEQ ID NO 
697SEQ ID NO 
402 SEQ ID NO 
1839 1856SEQ ID NO 
143 162SEQ1DNO 
170SEQ ID NO 

279sEQ ID NO 
402SEQ ID NO 
403SEQ ID NO 
473 SEQ ID NO 
582SEQ ID NO 
642SEQ ID NO 
737 SEQ ID NO 
934sEQ ID NO 
1291 1310SEQIDNO 
1356 1375 SEQ ID NO 
1549 1568q£q iq fsjo 

2107 2126SEQ ID NO 
245 264SEQ ID NO 

23SEQ ID NO 
79SEQ ID NO 
193SEQIDNO 

232 SEQ ID NO 
264sEQ ID NO 
342 SEQ ID NO 
388SEQ ID NO 
443SEQ ID NO 
488SEQ ID NO 
535SEQ ID NO 
664SEQ ID NO 
665SEQ ID NO 
865SEQ ID NO 
993SEQ ID NO 
1222 1241 SEQ ID NO 
1347 1366SEQIDNO 
1435 1454SEQIDNO 
1839 1858SEQ ID NO 
1852 1871 SEQ ID NO 
1915 1934SEQIDNO 
1962 1 981SEQIDNO 



4 
60 
174 
213 

245 

323 
369 
424 
469 
516 
645 
645 
846 
974 



Match Start End 


# 


B 


Index Index 








id. \ 


9171 


1 


5 


t3o4o gg idiggaccccaigaigg 






1 


5 


ob44tttattacaiGaagaagga 




1 UU4 


1 


5 


oo4o augiacgiaccaig caga 


1 


fti n 


1 


5 


DDHDiayuiyoayyyyLL-L.iuLy 




9n'^R 


1 


5 


5B47cctataaatcatcctttaa 


2539 


2558 


1 


5 


5548 aGctqtaaatcatGGttta 


2538 


2557 


1 


5 


5549gttccgaatgtGtgaggaG 


2176 


2195 


o 

Z ■ 


4 


5550atgggctQCGagatctggg 


2451 


2470 


9 


A 


5551 ttcaGatcctagGtcggga 


1929 


1948 


1 


4 


5552tggttaagctcttacaccG 


1680 


1699 


1 


4 


5553gGtgcctccaggtgacagc 


2494 


2513 


1 


4 


5554gttGgGGttGaGtatggac 


1652 


1671 




4 


5555tgttcGgaatgtctgagga 


2175 


2194 




4 


5556agGtggcctggtttgatac 


2517 


2536 




4 


5557gttcgccttcactatggaG 


1652 


1671 


1 


4 


5558gttcgtgGaGatGaggata 


1820 


1839 


1 


4 


5559 agaaagcaagctcatcata 


11 26 


1 14o 


1 


4 


5560cGaaagagtagctgcaggg 


2029 


2048 


1 


4 


5561 caggigacagcaatcagct 


2502 


2521 


1 


4 


5562 gcagctgctgttttgttcc 


2162 


2181 




4 


5563gtcatctgaccagGcgaGa 


1605 


1624 




4 


5564GtGtaggaatgaaggtgtg 


2134 


2153 




4 


5565cagctGgttgtaGGgctgg 


828 


847 




3 


5566ggGcaccacGGtggtgGtg 


2420 


2439 




3 


5567 qaaqaqgatgtggatacGt 


359 


378 




o 


5568 gctgatattgatggacagt 


437 


456 




o 
o 


5569tccaggtgacagcaatcag 


2500 


2519 




o 
o 


5570 caacaacaQtcttacctGa 


275 


294 




3 


5571 cactgagcGtgGGatctgt 








3 


ootZ aggactaaatacGatiGGa 


1 y ^ 


•1 Q01 
1 \3\S \ 




3 


5573 aiGagGtggcGiggtttga 


Zo14 


Zooo 




3 


5574 agctggtggaatgcaagct 


1276 


1295 




3 


5575 gtagaagctggtggaatgc 


1271 


1290 




3 


5576tGagatgatataaatgtgg 


1430 


1449 




3 


5577ttGagatgatataaatgtg 


1429 


1448 




3 


5578 gaaatcttgccctttgtcc 


1743 


1762 




3 


5579 gtaaatcatcctttaggag 


2542 


2561 




3 


5580tagctgcaggggtGctctg 


2037 


2056 




3 


5581 gaaatcttgGGctttgtcG 


1743 


1762 




3 


5582 ggtgacagggaagacatca 


1562 


1581 




3 


5583 atggccaggatgccttggg 


2370 


2389 




3 


5584 ctgtgaactlgctcaggaG 


2053 


2072 




3 


5585gtgaacttgctGaggacaa 


2055 


2074 




3 


5586 tttaggagtaacaatacaa 


2553 


2572 




3 
SEQ ID NO: 5540 gaagctattgaagctgagg 2084 2103 SEQ ID NO: 5587 cctctgacagagttacttc 2114 2133
SEQ ID NO: 5541 tcagaacagagccaatggc 2247 2266 SEQ ID NO: 5588 gccaccaccctggtgctga 2421 2440






Table 13. Selected palindromic sequences from human hepatitis C virus (HCV)





Source 


Start 

Index 


End 

Index 






Match


Start 
Index 


End 
Index 


# 


B 


SEQ ID NO: 5589 


:agcacctgggtgctggta 


5314 


5333 ' 


SEQ ID NO:
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 


6135 


.OGcaiicaGGcagctgctg 


6196 


6215 


1 


9 


SEQ ID NO: 5590< 


aactcgtccggatgcccgg 


1682 


1701s 


6136 


-cgggcagcgggtcgagtt 


8202 


8221 


1 


8 


SEQ ID NO: ^^^'^ 


:jgctQGtgggtagcgclca 


1049 


1068* 


61371 


:gagagcgacgccgGagcg 


6151 


6170 




7 


SEQ ID NO: 5592 


iJtccggatcGcacaagccg 


1352 


1371J 


6138( 


:^ggGatgtgggcccgggag 


6053 


6072 


1 


7 


SEQ ID NO: 5593 


igtaacatcgggggggtcg 


2048 


2067J 


6139( 


:igacccctcGcacattaca 


6871 


6890 


1 


7 


SEQ ID NO: 5594 


gtaacatcgggggggtcgg 


2049 


2068 i 


6140( 


[^cgacccctccGacattac 


6870 


6889 


1 


7 


SEQ ID NO: 5595 


:jagccaccaagcaggcgga 




5575 , 


6141 i 


tccggctggttcgttgGtg 


9254 


9273 


1 


7 


SEQ ID NO: 5596 
SEQ ID NO: 5597 
SEQ ID NO: 5598 
SEQ ID NO: 5599 
SEQ ID NO: 5600 
SEQ ID NO: 5601 
SEQ ID NO: 5602 
SEQ ID NO: 5603 
SEQ ID NO: 5604 
SEQ ID NO: 5605 
SEQ ID NO: 5606 
SEQ ID NO: 5607 
SEQ ID NO: 5608 
SEQ ID NO: 5609 
SEQ ID NO: 5610 
SEQ ID NO: 5611 
SEQ ID NO: 5612
SEQ ID NO: 5613
SEQ ID NO: 5614
SEQ ID NO: 5615
SEQ ID NO: 5616
SEQ ID NO: 5617
SEQ ID NO: 5618
SEQ ID NO: 5619
SEQ ID NO: 5620
SEQ ID NO: 5621
SEQ ID NO: 5622
SEQ ID NO: 5623
SEQ ID NO: 5624
SEQ ID NO: 5625
SEQ ID NO: 5626
SEQ ID NO: 5627
SEQ ID NO: 5628
SEQ ID NO: 5629
SEQ ID NO: 5630
SEQ ID NO: 5631 


ctcaGGacccagaacaccG 


5744 


5763. 


6142 


gggtgtgcacggtgttgag 


6291 


6310 


1 


7 


ccagcottaccatcaccca 


6189 


6208, 


6143 


tgggcgctggtatcgctgg 


5832 


5851 


^ 


7 


ctacgccgtgttccggctc 


6249 


6268 


6144 


gagccGgaacGggaGgtag 


6830 


6849 




' 7 


tacgccgtgttccggctcg 


6250 


6269 


6145 


cgagcccgaaGcggacgta 


6829 


6848 




, 7 


gagttcctggtaaaagcct 


8216 


8235 


6146 


aggctatgactaggtactc 


8634 


8653 




. 7 


atggcggggaactgggcta 


1430 


1449 


6147 


tagcgcattttcactGcat 


9019 


9038 




6 


aaccaaacgtaacaccaac 


370 


389 


6148 


gttgccgctaccttaggtt 


4115 


4134 




6 


ggtggtcagatcgttggtg 


419 


438 


6149 


caccagcccgctcaccacc 


5734 


5753 




6 


ccttggcccctctatggca 


584 


603 


6150 


tgccaacglgggtacaagg 


6374 


6393 


1 


6 


taccccggccacgcgtcag 


1265 


1284 


6151 


ctgacgactagctgcggta 


8465 


8484 


1 


6 


g g gcacg ctg cccgcctca 


1508 


1527 


6152 


tgagacgacgacGglgccG 


4759 


4778 




6 


ctgcaatgactccctccag 


1624 


1643 


6153 


ctggtggccctGaatgcag 


2594 


2613 




6 


aaccgatcgtctcggcaac 


1897 


1916 


6154 


gttgccgctaccttaggtt 


41 1 5 


41 34 




6 


gtgcggggcccccccgtgt 


2032 


2051 


6155 


acaccacgggcccctgcac 


6537 


6556 




6 


atgtggggggcgtggagca 


2238 


2257 


6156 


tgctcaatgtcGtacacat 


7610 


7629 


'I 


6 


ggagagcgttgcaacttgg 


2288 


2307 


6157 


ccaagctcaaactGactcc 


9207 


9226 




6 


cgtccgttgccggagcgca 


2613 


2632 


6158 


tgcgagcccgaaccggacg 


6827 


6846 




6 


gtctggcattattgacctt 


2817 


2836 


: 6159 


aaggtcacctttgacagac 


7763 


7782 




. 6 


tctttgatatcaccaaact 


2997 


3016 


: 6160 


agttcgatgaaatggaaga 


5454 


5473 




6 


cttctgattgccatactcg 


3014 


3033 


: 6161 


cgagcaattcaagcagaag 


5518 


5537 


1 


6 


gcggcgtgtggggacatca 


3314 


3333 


: 6162 


tgatcacgccatgcgccgc 


7641 


7dd0 




D 


gggacatcatcctgggcct 


3324 


3343 


: 6163 


aggcggtggattttgtccG 


391 o 


oyo4 




b 


gggcgtcttccgggccgct 


387^ 


3893 




agcggcacggcgaccgccc 






* 


\j 


ggcgtcttccgggccgctg 


3875 


3894 


. 6165 


GagGggcacggcgaGcgcc 


7438 


7457 




6 


gcgtcttccgggccgctgt 


3876 


3895 


: 6166 


acaggtgccctgatcacgc 


7631 


7650 




6 


gtccccggtcttcacagac 


3961 


3980 


: 6167 


gtcttggaagaacccggac 


7252 


7271 




6 


catcaggactggggtaagg 


4MA 


■ 4193 


: 6168 


r Gcttcctcaagccgtgatg 


8155 


8174 




6 


1 ccgacggtggttgctccgg 


424£ 


4264 


SEQ ID NO 


. 6168 


* ccgggggaacggccctcgg 


4853 


4872 




6 


^ggggggaaggcacctcatt 


4501 


452C 


SEQ ID NO 


: 61 7C 


taatgttgtgacttggcccc 


8334 


8353 


1 


6 


jccgagoaattcaagcagaa 


5517 


5536 


SEQ ID NO 


: 6171 


ttctgattgccatactcgg 


3015 


3034 




6 


Jagatgaaggcaaaggcgtc 


7821 


784C 


SEQ ID NO 


: 6172 


[ gacgaccttgtcgttatct 


8564 


8583 




6 


' cccctag g g g g cgctg cca 


767 


' 78€ 


SEQ ID NO 
SEQ ID NO 


. 6172 


I tggccggcgccccccgggg 


3674 


3693 


1 3 


5 


\ ctcccggcctagttggggc 


64€ 


) 66£ 


: 617^ 


gcGccGCcttgagggggag 


751£ 


> 753S 


! 2 


5 


) ttccgctcgtcggcggccc 


75C 


) 76£ 


SEQ ID NO 


; 6176 


igggcaaaggacgtccggaa 


7922 


\ 7942 


: 2 


5 


)cccctagggggcgctgcca 


767 




jSEQ ID NO 


: 6176 


^tggcgggggcccactgggg 


1382 


\ 1402 


! 2 


5 


1 gccccgccggcatgcgaca 


1222 


> 1241 


SEQ ID NO 


: 5177 


' tgtcccagggggggagggc 


9147 


' 91 6€ 


\ 2 


: 5 
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SEQ 


ID 


NO: 


O 1 — /"N 


\D 


NU. 


olz^ 




MO* 




in 


1 NSW. 


SEQ 


ID 


NO* 


SEO 


ID 


NO* 


SEQ 


ID 


NO: 


SEQ 


ID 


WO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEO 


ID 


NO- 




lU 


MO* 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 




ID 


MO- 




in 


NO- 




in 






lU 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


WO: 


SEQ 


ID 


NO: 



5632 



aggacgaccgggtcctttc 



5633 ggacgaccgggtcctttct 



5634 aaaaccaaacgtaacacca 



5635 caaccgccgcccacaggac 
5636 
5637 



5638tgccgcgcaggggccccag 



5639 gggccccaggttgggtgtg 



5640 gttggggccccacggacxx; 



5641 ttggggccccacggacccc 

5642ltggggccccacggaccccc 



5643 cctcacatgcggcctcgcc 



5644 cacatgcggcctcgccgac 
5645ltccgctcgtcggcggcccc 



5646 ggcgctgccagggccttgg 



5647 ccatgtcacgaacgactgc 



5648 gtgccctgcgttcgggagg 



5649tgccctgcgttcgggaggg 



5650 gccctgcgttcgggagggt 

5651 

5652 
5653 



5652tccccactacgacaatacg 



5654atttgctcgttggggcggc 



5655 ccttctcgccccgccggca 



5656accccggccacgcgtcagg 



cggtggtcagatcgttggt 



acctgttgccgcgcagggg 



176 



179 



368 



385 



418 



444 



460 



460 



657 



658 



715 



718 



751 



776 



aggaatgctaccatcccca 



atacgacaccacgtcgatt 



5657 gccctcgtagtgtcgcagt 



5658gccgtctcagagaatccag 



5659 ctgaactgcaatgactccc 



5660 agactgggttt cttgccgc 
5661 

5662 



tcgtccggatgcccggagc 



ccagggatggggtcctatc 



5663 gacaaccgatcgtctcggc 



5664 caagacgtgcgggg ccccc 



5665acgtgcggggcccccccgt 



5666 ccggaagcaccccgaggcc 



5667 aggccacgtactcaaaatg 



5668 tgtatgtggggggcgtgga 



5669gagtggcaggttctgccct 



5670 tcctttgcaatcaaatggg 



agcccaggccgaggccgcc 



5671 

i672 

5673 gcggcatatgctttctatg 



5672 ggcggcatatgctttctat 



5674 cggcatatgctttctalgg 
5675 

5676 Gccccctcaacgtccgggg 



5677 gggcaggggtggcgactcc 



5678 atgttggactgtctaccat 



5679 tgttggactgtctaccatg 



943 



1019 



1020 



1021 
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1685 
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6178 


aaaaaaaaacaattatcct 


7341 


7360 


i 
1 


5 


6179 


aaaaaaaaciacaattatnr' 


7340 




1 


5 


6180 


tggtttttttttttttttt 


9443 


9462 


1 


5 


6181 


gtcctqaacccqtctqttq 


4100 


4119 


1 


5 


6182 


accattgagacgacgaccg 


4754 


4773 


1 


5 


6183 


ccccggccacgcgtcaggt 


1267 


1286 


1 


5 


6184 


ctgggcgcgctgacgggca 


3164 


31 83 


1 


5 


6185 


cacagcctgtctcgtgccc 


9296 


9315 


1 


5 


5186 


gggtggglagccgcccaac 


5783 


5802 


1 


5 


6187 


ggggtgggtagccgcccaa 


5782 


5801 


1 


5 


6188 


gggggtgggtagccgccca 


5781 


5800 


1 


5 


5189 


* 

ggcggggcgacaatagagg 


3774 


3793 


1 


5 


6190 


gtcgtcggagtcgtgtgtg 


6020 


6039 


1 


5 


6191 


ggggcaaaggacgtccgga 


7922 


7941 


1 


5 


6192 


ccaagccacagtgtgcgcc 


5110 


5129 


1 


5 


6193 


gcagcaacacgtggcatgg 


6498 


6517 


1 


5 


6194 


cctcacaacgggggggcac 


1495 


1514 


1 


5 


6195 


ccctcacaacgggggggca 


1494 


1513 


1 


5 


6196 


accctcacaacgggggggc 


1493 


1512 


1 


5 


6197 


taaacatcaacacaatcct 


4323 


4342 


1 


£; 
\j 


6198 


cotattcccaaatttaaaa 


8092 


8111 
will 


1 


5 


6199 


aatcaatoctotaacatat 


4576 


4595 


1 


5 


6200 


□ CCQ ccactta caacaaa t 

Sj \^^y>^ wl^y fcL^j ^^Vj VJ ^^falWt^l L 
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A 
1 


5 
«j 
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tQCoaanataontafiaann 


6374 


fi3Q3 


-1 
1 


\j 
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1 




6203 


actacatcaQcatataQOp 


6046 




-1 
1 




6204 
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5857 


1 
1 




6205 


aaaacaaatcaaaactcaa 


2313 


2332 
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\J £-mmt KJ 


Q CO a caa n ccta en a a tct 


8609 


8628 


1 




6207 


Q ctccQ a CI a acartttacnf^ 

y vr ivfoy y y y y Vr ita\*y a 


4257 


4276 


1 




6208 


aataacttcccctacctoa 


5084 


5103 


-1 
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y v^v^y ^y y i^c«v^»-ry y y ly 


6343 


6362 


-1 
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aaaatctcccccctcctta 


6919 


6938 


1 
1 


5 


6211 


a CO a a cq cccccattaco t 


4202 


4221 


1 

1 


5 
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aaccactatatacacccao 

y y v^Vd^y «b^Ly LCiLy\yciv^woy y 


3886 

Vh^ 


3905 


1 


\j 


6213 


cattatatccaaataocct 


3137 


3156 


1 

1 
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tccaagtggcccatctaca 


4011 


4030 


1 


5 


6215 


agggcaggggtggcgactc 


3400 


3419 


1 


5


6216 


cccaccttatgggcaagga 


8861 


8880 


1 


6 


6217 


ggcgtccacagtcaaggct 


7834 


7853 


1 


5 


6218 


atagaagaagcctgccgcc 


7865 


7884 


1 


5 


6219 


catagaagaagcctgccgc 


7864 


7883 


1 


5 


6220 


ccatagaagaagcctgccg 


7863 


7882 


1 


6 


6221 


ggggggacggcatcatgca 


6402 


6421 


1 


5 


6222 


ccccaatcgatgaacgggg 


9376 


9395


1 


5 


6223 


ggaggccgcaagccagccc


8066 


8085 


1 


5 


6224 


atggtaccgaccctaacat


4158 


4177 


1 


5 


6225 


catggtaccgaccctaaca 


4157 


4176 


1 
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6226t 


gcacgatgctcgtgaacg 


8543 


8562 


1 
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icaccatacacctatci oca 


3704 


37235 


6227t 


gccgcggttaccgggtgt 


6342 


6361 


1 


5 
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iaccatacacctataacaa 


3705 


3724$ 
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jtgccgcggttaccgggtg 


6341 
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1 


5 
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lacatcaQcacaatcctQCi 


4325 


4344 < 


6229 c 


xaggattgcccgtttgcc 


4979 
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1 


5 
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saacaQaaacoQctaaoQc 


4347 
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6230 c 


jctccccccagcgctgctt 


5804 
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1 


5 
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laaacacaacttatcatac 


4361 


4380 < 


6231 [ 


^cacggcgaccgcccctcc 


7443 
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1 
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SEQ ID NO: 5686 


^yaag ccdicjsictg gg y y y di 


AARQ 


4. 'SOB* 


62321 


cccccca Q CQ ctq cttcg 


5800 


5825 


1 
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7225 


7244 


1 


5 
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6234? 
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1 
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SEQ ID NO: 5689 
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6237 


□ctcctcatacoci attcca 


8175 


8194 


1 


5 


tggcgagctttggagacct


5603 


5622


6238 


aggtgccctgatcacgcca 


7633 


7652 


1 


5 


gcccgctcaccacccagaa 


5739 


5758 


6239 


ttctggcgggctatggggc 


5895 


5914 


1 


5


tgagtgacttcaagacctg


6306 


6325 


6240 


caggctataaaatcgctca 


8363 


8382 


1 


5 


atqtcaaaaacggttccat 


6456 


6475 


6241 


atggtaccgaccctaacat 


4158 


4177 


1 


5 


ccgaaaacctgcagcaaca 


6488 


6507 


6242 


tgttcctccaatgtgtcgg 


8708 


8727 


1 


5 


ggcgccaaactattccaag 


6565 


6584 


6243 


cttgaaagcctctgccgcc 


8500 


8519 


1 


5 


gccctccttgagggcgaca 


6967 


6986 


6244 


tgtctcctacttgaagggc 


3814 


3833 


^ 


5 


cacccgcgtggagtcggag 


7078 


7097 
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ctccggtggtacacgggtg 
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7297 


1 


5 


»3 Cf v3^»J»Jw w w w 


7138 


7157 


6246 


ttcatgctgtgcctactcc


9326 


9345 


1 


5 


qcqqcgatacccatatggg 


7202 


7221 


6247 


cccagggggggagggccgc 


9150 


9169 


1 


5 


iiyccaCLpLyiuciayy wulf 




7320 


6248 


aqqccqccacttgcgqcaa 


9162 


9181 


1 


5 


ccccccciiy dy y y y y ay ^ 




7539 


6249 


qctcccggcctagttggg g 


645 


664 


1 


5 


U ly U ly L. LUcl a ly LLf O LCJ L» 


7606 


7625 


6250 


gtaggactggcaggggcag


4809 


4828 


1 


5 


caiggacagyigcooiyaL 


1 \j£-\J 


7645 


; 6251 


atcattgaacgactccatg 


8996 


9015 


1 


5 


aiggacagg ly ccciy j lo 


1^91 


7646 


: 6252 


gatcattgaacgactccat 


8995 


9014 


1 


5 


y g c la ig aoiciygiciuiL.o 




8654 


: 6253 


QQaQcaacttgaaaaagcc 


8920 


8939 


1 


5 


caccaiayaiC'doiL.uuoL 


91 


46 
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aaqqccttqqcacatggtg 


785 
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2 


4 


agctgttcaccttctcgcc 


1206 
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: 6255 


ggcgtgctgacgactagct 


8459 


8478 


2 


4 


ctgcaatgactccctccag 


1624 


1643 


: 6256 


ctggtgcggctgttggcag 


5847 


5866 


2 


4 


atgtggggggcgtggagca 


2238 


2257 


: 6257 


tgctgcgccatcacaacat 


7701 


7720 


2 


4 


tggggacatcatcctgggc 


3322 


3341 


: 6258 
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5795 


5814 


2 
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: 6269 


aggcaggagataacttccc 
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5095 
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4 
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r 
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\ 802 


1 1 


4 
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9e 
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4 
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6405 


I 6421 


1 


4 
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4 
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i 4 
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25' 
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i gaccaggatctcgtcggct 
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3 ' 


1 4 
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1 4 
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^gatcatgcatactcccggg 
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Igccgcgcaggggccccagg 
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1 4 


accccgtggaaggcgacag 
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6284 


« gaggccgcaagccagcccg 


8067 


' 808e 


I 1 


4 


gcatggggtgggcaggatg 


609 
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: 6285 


i catcgataccctcacatgc 


70€ 


I 72£ 


i 1 


4 


tcctgtcaccccgcggctc 


630 


649 


: 6286 


! gagctgcaaagctccagga 


8522 


; 8542 


: 1 


4 


gggccccacggacccccgg 


661 


68Q 


: 6287 


ccggccgcatatgcggccc 


4064 


4083 


> 1 


4 


ggccccacggacccccggc 


662 


681 


; 6288 


gccggccgcatatgcggcc 


4063 


4082 


1 


4 


cggcctcgccgacctcatg 


724 


743 


: 6289 


catgaggatcatcgggccg, 


6472 


6491 


1 


4 


ggcctcgccgacctcatgg 


725 


744 


: 6290 


ccatgaggatcatcgggcc 


6471 


6490 




4 


ggccccctagggggcgctg 


764 


783 


: 6291 


cagctccgaattgtcggcc 


7414 


7433 


'J 


4 


tggcacatggtgtccgggt 


792 


811 


. 6292 


acccacgc^gcacgggcca 


5188 


5207 




4 


cttcctcttggctctgctg 


868 


887 


: 6293 


cagcataggtcttgggaag 


5863 


5882 




4 


catgtcacgaacgactgct 


944 


963 


: 6294 


agcagtgctcacttccatg 


6847 


6866 




4 


gaggcggcggacttgatca 


983 


1002 


: 6295 


tgatggcattcacagcctc 


5712 


5731 




4 


catccccactacgacaata 


1096 


1115 


6296 


tattaccggggtcttgatg 


4592 


4611 




4 


gctgttcaccttctcgccc 


1207 


1226 


6297 


gggctgcgtgggaaacagc 


8793 


8812 




4 


gccccgccggcatgcgaca 


1222 


1241 


6298 


tgtctcctacttgaagggc 


3814 


3833 




4 


tggcctgggacatgatgat 


1293 


1312 


6299 


atcaatttgctccctgcca 


5981 


6000 




4 


cacaagccgtcatcgacat 


1362 


1381 


6300 


atgtttgggactgggtgtg 


6279 


6298 




4 


agccgtcatcgacatggtg 


1366 


1385 


6301 


caccaagcaggcggaggct


5560 


5579 




4 


ggtggcgggggcccactgg


1381 


1400 


6302 


ccagggctcaggccccacc 


5127 


5146 




4 


gggggcccactggggagtc 


1387 


1406 


6303 


gactaggtactccgccccc 


8641 


8660 




4 


atggcggggaactgggcta 


1430 


1449 


6304 


tagcagtgctcacttccat 


6846 


6865 




4 


ttgattgtgatgctacttt 


1454 


1473 


6305 


aaagcaagctgcccatcaa 


7665 


7684 




4 


caacgggggggcacgctgc 


1500 


1519 


6306 


gcagaaggcgctcgggttg 


5530 


5549 




4 


acgctgcccgcctcaccag 


1512 


1531 


6307 


ctggacccgaggagagcgt 


2278 


2297 




4 


tcagagaatccagcttata 


1564 


1583


6308 


tatatcgggggtcccctga 


8393 


8412 




4 


accaatggcagttggcaca 


jk #-1 j-fc 

1586 


1605, 


63091 


tgtggctcggggccttggt 


2132 


2151 




4 


ccaatggcagttggcacat


1587 


1606, 


6310i 


atgtggctcggggccttgg 


2131 


2150 




4 


^tcctatcacttatgctga 


1749 


1768j 


6311 


tcaggactggggtaaggac 


4176 


4195 
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6312 


gggtggcttcatgcctcag 


9063 


9082 




4 


SEQ ID NO: 5767 ( 


:5aggtgtgtggtccagtgt 


1844 


1863J 


6313c 


acactccagttaactcctg 


8817 


8836 




4 


SEQ ID NO: 5768


tgtggtccagtgtattgct


1850 


1869| 


631 4 « 


agcagggccatcaaccaca


7949 


7968 




4 


SEQ ID NO: 5769


gcttcaccccaagtcctgt 


1866 


1885( 


6315c 


acagcagaggcggctaagc 


6887 


6906 




4 


SEQ ID NO: 5770


ctgttgtcgtggggacaac


1881 


1900 J 


6316c 


gttgcaacttggacgacag


2295 


2314 




4 


SEQ ID NO: 5771 c 


gcc^ccgcaaggcaactgg 


1972 


1991* 


6317c 


ccagttggacttatccggc


9241 


9260 




4 


SEQ ID NO: 5772


ggcaactggttcggctgta


1982 


2001 < 


631 8 1 


tacacgggtgcccattgcc


7287 


7306 




4 


SEQ ID NO: 5773


gcaactggttcggctgtac


1983 


2002 < 


6319c 


gtacacgggtgcccattgc


7286 


7305 


1 


4 


SEQ ID NO: 5774c 


xccgtgtaacatcggggg 


2043 


2062 < 


6320 c 


"cccaatcgatgaacgggg 


9376 


9395 


1 


4 
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SEQ ID NO: 5775( 


ggactgcttccggaagcac 


2092 


2111, 


SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO" 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

iSEQ ID NO 
SEQ ID NO 


6321 


gtgctggtaggcggagtcc 


5324 


5343 


1 


4 


SEQ ID NO: 5776j 


gactgcttccggaagcacc 


2093 


2112, 


6322 


ggtgctggtaggcggagtc 


5323 


5342 


1 


4 


SEQ ID NO: 57771 


tccggaagcaccccgaggc 


2100 


2119, 


6323 


gcctacgagtcttcacgga 


8616 


8635 


1 


4 


SEQ ID NO: 5778; 
SEQ ID NO: 5779 
SEQ ID NO: 5780 
SEQ ID NO: 5781 
SEQ ID NO: 5782 
SEQ ID NO: 5783 
SEQ ID NO: 5784 
SEQ ID NO: 5785 
SEQ ID NO: 5786 
SEQ ID NO: 5787 
SEQ ID NO: 5788 
SEQ ID NO: 5789 
SEQ ID NO: 5790 
SEQ ID NO: 5791 
SEQ ID NO: 5792 
SEQ ID NO: 5793 
SEQ ID NO: 5794 
SEQ ID NO: 5795 
SEQ ID NO: 5796 
SEQ ID NO: 5797 
SEQ ID NO: 5798 
SEQ ID NO: 5799 
SEQ ID NO: 5800 
SEQ ID NO: 5801 
SEQ ID NO: 5802 
SEQ ID NO: 5803 
SEQ ID NO: 5804 
SEQ ID NO: 5805 
SEQ ID NO: 5806 
SEQ ID NO: 5807 
SEQ ID NO: 5808 
SEQ ID NO: 5809 
SEQ ID NO: 5810 
SEQ ID NO: 5811 
SEQ ID NO: 5812 
SEQ ID NO: 5813 
SEQ ID NO: 5814 
SEQ ID NO: 5815 
SEQ ID NO: 5816
SEQ ID NO: 5817
SEQ ID NO: 5818
SEQ ID NO: 5819
SEQ ID NO: 5820
SEQ ID NO: 5821 
SEQ ID NO: 5822 


actcaaaatgtggctcggg 


2124 


2143 


6324 


cccgggcagcgggtcgagt 


8201 


8220 


1 


4 


ggccttggttgacacctag 


2142 


2161 


6325 


ctagccggcccaaaaggcc 


3611 


3630 


1 


4 


aggagagcgttgcaacttg 


2287 


2306 


6326 


caagccgtgatgggctcct 


8162 


8181 


1 


4 


ggacagatcggagctcagc 


2314 


2333 


6327 


gctgggggtcattatgtcc


3128 


3147 


1 


4 


cagatcggagctcagcccg 


2317 


2336


6328 


cgggtggcccactgctctg 


3837 


3856 


1 


4 


ggagctcagcccgctgctg 


2323 


2342 


6329 


cagctgctgaagaggctcc 


6206 


6225 


1 


4 


caccctaccggctctgtcc 


2383 


2402 


6330 


ggactgggtgtgcacggtg 


6286 


6305 


1 


4 


cggctctgtccactggctt 


2391


2410 


6331 


aagcaggcggaggctgccg 


5564 


5583 


1 


4 


ccatcagaacatGgtggac 


2419 


2438 


6332 


gtccccgttgagtccatgg 


3929 


3948 


1 


4 


ggtcagcggttgtctcctt 


2460 


2479 


6333 


aaggatgattctgatgacx: 


8875 


8894 


1 


4 


gccgccttagagaacctgg


2579 


2598 


6334 


ccagttggacttatccggc 


9241 


9260 


1 


4 


gccttagagaacctggtgg 


2582 


2601 


6335 


ccaccaagcaggcggaggc 


5559 


5578 


1 


4 


gccggagcgcacggcatcc 


2621 


2640 


6336 


ggattgggcccacgccggc 


3214 


3233 


1 


4 


gctgcatcgtgcggaggcg 


2786 


2805 


6337 


cgccacgacatcccgcagc 


7726 


7745 


1 


4 


attattgaccttgtcgcca


2824 


2843 


6338 


tggcaacagacgctctaat 


4647 


4666 


1 


4 


tcgccatattacaaggtgt 


2837 


2856 


6339 


acacaatctttcctggcga 


3539 


3558 


1 


4


cgccatattacaaggtgtt 


2838 


2857 


6340 


aacacaatctttcctggcg 


3538 


3557 


1 


4 


gtccggggaggccgcgatg 


2939 


2958 


6341 


catcggcacagtcctggac 


4327 


' 4346 


1 


4 


tcaccccactgcgggattg 


3201 


3220 


6342 


caatttaccaatgttgtga 


8325 


8344 


1 


4 


ttgggcccacgccggccta 


3217 


3236 


► 6343 


taggctaggggccgtccaa 


5221 


5240 


1 


4 


ctacgggaccttgcggtag 


3233 


3252 


: 6344 


ctactcctactttctgtag 


9338 


9357 


1 


4 


cctgtcgtcttctctgaca


3260 


3279 


: 6345 


tgtcctacacatggacagg 


7617 


7636 


1 


4 


ctgtcgtcttctctgacat 


3261 


3280 


: 6346 


atgtcctacacatggacag


7616 


7635 


1 


4 


cctggggggcagacaccgc 


3297 


3316 


: 6347 


gcggggtaggactggcagg 


4804 


4823 


1 


4 


gggggcagacaccgcggcg 


3301 


3320 


: 6348 


cgcccaactcgctcccccc 


5794 


5813 


1 


4 


ggcgtgtggggacatcatc 


3316 


3335 


: 6349 


gatgttattccggtgcgcc 


3755 


3774 


1 


4 


tggggccggccgatagtct 


3378 


3397 


: 6350 


agacgacgaccgtgcccca 


4761 


4780 


1 


4 


gaaccaggtcgagggggag 


3499 


3518 


: 6351 


ctccacctatggcaagttc 


4222 


4241 


1 


4 


gagggggaggttcaagtgg 


3509 


3528 


: 6352 


ccacctgtcaaggcccctc 


7304 


7323 


1 


4 


aggcccaatcgcccagatg 


3625 


3644 


: 6353 


catcccgcagcgcgggcct 


7734 


7753 


1 


4 


ggcccaatcgcccagatgt 


3626 


3645 


: 6354 


acatcccgcagcgcgggcc


7733 


7752 


1 


4 


caggatctcgtcggctggc 


3659 


3678 


: 6355 


gccaataggccatttcctg 


9410 


9429 


1 


4 


aggatctcgtcggctggcc 


3660 


3679 


: 6356 


ggccaataggccatttcct


9409 


9428 


1 


4 


gccccccggggcgcgttcc 


3682 


3701 


; 6357 


ggaacctatccagcagggc 


7938 


7957 


1 


4 


gcacctgtggcagctcgga 


3711 


3730 


: 6358 


tccggtggtacacgggtgc 


7279 


7298 


1 


4 


ctgtggcagctcggacctt 


3715 


3734 


6359 


aaggcaaaggcgtccacag 


7826 


7845 


1 


4 


gcggggcgacaatagaggg


3775 


3794 


: 6360 


ccctgcctgggaaccccgc 


5682 


5701 


1 


4 


i ggagcttgctctcccccag 


3792 


3811 


: 6361 


ctggttgggtcacagctcc 


6806 


6825 


1 


4 


►gagcttgctctcccccagg 


3793 


. 3812 


: 6362 


cctggttgggtcacagctc 


6805 


6824 


1 


4 


' acttgaagggctcttcggg 


3822 


: 3841 


; 6363 


cccgtggtggagtccaagt 


558 £ 


5604 


1 


4 


s tgtccccgttgagtccatg 


3928 


\ 3947 


: 6364 


catggtctacgccacgaca 


7717 


7736 


1 


4 


J gaaactactatgcggtccc 


3947 


' 3966 


: 6365 


gggaaggcacctcattttc 


4504 


4523 


1 


4 


1 aaactactatgcggtcccc 


3948 


\ 3967 


: 6366 


ggggggcatatacaggttt 


4828 


4847 


1 


4 


ctcccactggcagcggcaa


4032 


\ 4051 


SEQ ID NO 


: 6367 


ttgccaggaccatctggag 


4993 


5012 


1 


4 


\ ggcgtatatgtctaaagca 


4138 


1 4157 


SEQ ID NO 


: 6368 


tgctcgccaccgctacgcc 


4377 


4396 


1 


4 
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NO" 
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ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO* 


SEQ 


ID 


NO- 


SEQ 

mmm 


ID 


NO' 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 




ID 




ceo 


in 
ID 


MO- 

NvJ, 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 


SEQ 


ID 


NO: 



5823 gcgtatatgtctaaagcac 



5824tggggtaaggaccattacc 



5825accattaccacgggcgccc 



5826cgtactccacctatggcaa 
582' 
5828; 



7 cagtcctggaccaagcgga 



5829cactcxaagaagaagtgcg 



5830 atcaatgctgtagcgtatt 
5831 



5832aggactggcaggggcaggg 



5833gggaacggccctcgggcat 
5834 



5835tggtacgagctcacccccg 



5836gggcttacctaaatacacc 



5837 ggcttacctaaatacacca 



583dgagataacttcccctacct 



5839cccacctccatcgtgggat 



584Q |catggcatgcatgtcggcc 
5841 Iggccgacctggaagtcgtc 



5842gccgacctggaagtcgtca 



5843tggaagtcgtcaccagcac 



5844gcacctgggtgctggtagg 



5845ggttatcgtgggtaggatc 



5846cccgatagggaagtcctct 



aqqqqpqaagqcacctcat 



cataccgaccagcggagac 



eg g g catg ttcg attcct 



n 



5847tgaaatggaagaatgcgcc 



5848ccaagtggcgagctttgga 



5849ttcatcagcgggatacagt 



5850agcgggcttatccaccctg 

585l|ccagcccgctcaccaccca 



5852gtgggcgctggtatcgctg 



5853ggaaggtgctagtggacat 



5854ggtcatgagcggcgaggcg 



5855catgtgggcccgggagagg 
5856latgtgggcccgggagaggg 



5857ggggccgtgcagtggatga 



5868 gcgttcg cttcg egg ggta 



5859 ggggtaaccatgtctcccc 



5860 catcacccagctgctgaag 

5861 laggactgttctacgccgtg 



5862 ttcaagacctggctccagt 



5863ctcctgccgcggttaccgg 



5864caccacgggcccctgcacg 



5865ggaggtcacgcgggtgggg 



5866 gaggtcacgcgggtggggg 



5867 atgtcaggttccagctcct 
5868atgaaatatccattgcggc 



5869 ctccattgttagagtcttg 



5870 tgcccattgccacctgtca 



4139 



4183 



4193 



4218 



4335 



4500 



4526 



4577 



4618 



4811 



4857 



4869 



4922 



4962 



4963 



5082 



5140 



5278 



5293 



5294 



5301 



5316 



5383 



5429 



5461 



5598 



5645 



5668 



5736 



5831 



5877 



5944 



6056 



6057 



6074 



6104 



6117 



6199 



6240 



6314 



6338 



6538 



6616 



6617 



6682 



7152 



7239 



7295 



4158SEQ 



4202 SEQ 



421 2 SEQ 



4354 



4619 



4237SEQ 
SEQ 
SEQ 
SEQ 
SEQ 
SEQ 



4545 



4596 



4637 



4830 SEQ 



4888 



4876SEQ 

SEQ 

SEQ 
SEQ 



4941 



4981 



4982 SEQ 
5101 SEQ 



5159SEQ 



5297 SEQ 



5312SEQ 



5313SEQ 



5320SEQ 



5335 SEQ 



5402 SEQ 



5448 SEQ 



5480 SEQ 



5617SEQ 



5664SEQ 



5687SEQ 



5850 



5896SEQ 



5963 SEQ 



6093 SEQ 



6136 



621 8 SEQ 



6259SEQ 



6557 SEQ 



6635SEQ 



6636 SEQ 
SEQ 
SEQ 



6701 



7171 



7258SEQ 



7314SEQ 



5755SEQ 
SEQ 



6075 SEQ 
6076SEQ 



6123SEQ 
^SEQ 



6333 SEQ 
6357 SEQ 



ID NO 
ID NO 
ID NO 

ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 

ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 

ID NO 
ID NO 

ID NO 
ID NO 
ID NO 

ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 



6369 


gtgctcgccaccgctacgc


4376 


4395j 1 


4 


6370 


aataaccatatctccccca 


6119 


6138 


1 


4 


6371 


gggcgctggtatcgctggt 


5833 


5852 


1 


4 


6372 


ttgccccaaccagaatacg 


8669 


8688 


1 


4 


6373 


tccgtgagccgcatgactg 


9560 


9579 


1 


4 


6374


atgagcggcgaggcgccct 


5948 


5967 


1 


4 


6375 


cgcatgactgcagagagtg 


9569 


9588 




4 


6376 


aatacgacttggagttgat 


8682 


8701 




4 


6377 


gtctcccccacgcactatg 


6128 


6147 




4 


6378 


ccctgccatcctctctcct 


5992 


6011 




4 


6379 


atgotcaccgaocco'tcco 


B863 


6882 




4 


6380 


gaggccgcaagccagcccg 


8067 


8086 




4 


6381 


cggggacttgccccaacca


8662 




T 


4 

r 


6382 


aatQactccatcttaaccc 


9518 


9537 




4 


6383 


tpatciactccatcttaacc 


9517 


9536 




4 


6384 


aqqttqqccaQQqqqtctc 


6908 


6927 




4 


6385 


atccaaqtttqqctatqqq 


7906 


7925 




4 


6386 


qacctctctacaaatcata 


9596 


9615 






6387 


gacgcccccacattcggcc


7885 


7904 




4 


6388 


tgacgcccccacattcggc 


7884 


7903 




4


6389 


gtgcccatgtcaggttcca 


6676 


6695 




4 


6390 


cctacacatggacaggtgc 


7620 


7639 




4 


6391 


gatcatcgggccgaaaacc 


6478 


6497




4 


6392 


as ae eg g ctttatatcg g g 


8383 


8402 




4 


6393 


ggcgcgctcgtggccttca 


5924 


5943 




4 


6394 


tccattgttagagtcttgg 


7240 


7259 


-1 


4 


6395 


actgcacgatgctcgtgaa 


8541 


8560 


1 


4 


6396 


caggggtggctggcgcgct 


5913 


5932 


1 


4 


6397 


tgggcgctggtatcgctgg 


5832 


5851 


1 


4 


6398 


cagcagggccatcaaccac 


7948 


7967 


1 


4 


6399 


atgtggtctccacccttcc 


8142 


8161 




4 


6400 


cgcccctcctgaccagacc 


7453 


7472 


i 


4 


6401 


cctccttgagggcgacatg 


6969 


6988 


1 


4 


6402 


ccctccttgagggcgacat 


6968 


6987 




4 


6403 


tcatgctcctctatgcccc 


7505 


7524 


i 


4 


6404 


taccaccacgagcttacgc 


2751 


2770 


1 


4 


6405 


gggggagccgggggacccc


7531 


7550 


1 


4 


6406 


cttcgagcg gaaq ggg atq 


7130 


7149 




4 

• 


6407 


cacggcgaccgcccctcct 


7444 


7463 




4 


6408 


actgcacgatgctcgtgaa 


8541 


8560 




4 


6409 


ccgggacgtgcttaaggag 


7804 


7823 




4 


6410 


cgtggaggtcacgcgggtg 


6613 


6632 




4 


6411 


cccotccaataccacctcc 


7317 


7336 




4 


6412 


cccctcctgaccagaoctc 


7455 


7474 




4 


6413 


aggagatgggcggaaacat 


7059 


7078 




4 


6414 


gccgtgatgggctcctcat


8166 


8184 




4 


6415 


caagtggcgagctttggag 


5599 


5618 




4 


6416 


tgactaattcaaaagggca 


8409 


8428 




4 
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SEQ ID NO: 5871 
SEQ ID NO: 5872 
SEQ ID NO: 5873 
SEQ ID NO: 5874 
SEQ ID NO: 5875 
SEQ ID NO: 5876 
SEQ ID NO: 5877 
SEQ ID NO: 5878 
SEQ ID NO: 5879

SEQ ID NO: 5880
SEQ ID NO: 5881 
SEQ ID NO: 5882 
SEQ ID NO: 5883 
SEQ ID NO: 5884 
SEQ ID NO: 5885 
SEQ ID NO: 5886 
SEQ ID NO: 5887 
SEQ ID NO: 5888 
SEQ ID NO: 5889 
SEQ ID NO: 5890 
SEQ ID NO: 5891 
SEQ ID NO: 5892 
SEQ ID NO: 5893 
SEQ ID NO: 5894 
SEQ ID NO: 5895 
SEQ ID NO: 5896 
SEQ ID NO: 5897
SEQ ID NO: 5898 
SEQ ID NO: 5899 
SEQ ID NO: 5900

SEQ ID NO: 5901
SEQ ID NO: 5902 
SEQ ID NO: 5903 
SEQ ID NO: 5904 
SEQ ID NO: 5905 
SEQ ID NO: 5906 
SEQ ID NO: 5907 
SEQ ID NO: 5908 

SEQ ID NO: 5909

SEQ ID NO: 5910 
SEQ ID NO: 5911
SEQ ID NO: 5912 
SEQ ID NO: 5913 
SEQ ID NO: 5914 
SEQ ID NO: 5915 


accacctccacggagaaaa 


7327 


' 7346 


SEQ ID NO 


: 6417|ttttttccctctttatggt 


9502 


! 9521 


1 


4 


! ccacctccacggagaaaaa 


7328 


\ 7347 


'SEQ ID NO 


: 64ie 


\ ttttccctctttatgg tgg 


950^ 


[ 9522 


\ 1 


4 


1 acctccacggagaaaaagg 


733C 


1 734S 


SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO; 
SEQ ID NO 
SEQ ID NO: 

SEQ ID NO. 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 


: 641 £ 


1 cctttgacagactgcaggt 


777C 


) 778£ 


1 


4 


ggttgtcctgacggactcc 


7351 


737C 


: 642C 


1 ggagctcgctaccaaaacc 


739C 


\ 740C 




/. 


1 cctgaccagacctccgaca 


746C 


1 7479 


: 6421 


tgtcctacacatggacagg 


7617 


' 7636 




4 


agcaagctgcccatcaacg 


7667 


7686 
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7344 


7363 




3 


1 gagtatgtcgtgttgcttt 


2492 


2511


: 6551 


aaagaccaagctcaaactc 


9202 


9221 




3 


itgtggatgatgctgctgat 


2547 


2566 


: 6552


atcactgatggcattcaca 


5707 


5726 




3 


ccgaggccgccttagagaa 


2574 


2593 


: 6553 


ttctgattgccatactcgg 


3015 


3034 




3 


p agaacctggtggccctcaa 


2589 


2608 


: 6554 


ttgatatcaccaaacttct 


3000 


3019 




3 


tacatcaagggcaggctgg 


2672 


2691 




ccagatgtacactaatgta 


3637 


3656 




3 


caagggcaggctggtccct 


2677 


269€ 


; 6556 


aOSggtaggcatctacttg 


9355 


9374 


1 


3 


gcatggccgctgctcctgc 


2720


2739


: 6557 


gcagtgctcacttccatgc 


6848 


6867 


1 


3 






SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 

SEQ ID NO 
SEQ ID NO 



catggccgctgctcctgct 



6012 



60 1 3 gccgctgctcctgctcctc 



6014ggagatggctgcatcgtgc 



601 Satggctgcatcgtgcggag 



6016ggcgcggtttttgtggglc 



601 Ttcttatcaccagagctgag 



6018 gtgtgggttccccccctca 



6019 tccccccctcaacgtccgg 



6020ctcaacgtccggggaggcc 
6021 



accaaacttctgattgcca 



6022 caaacttctgattgccata 



6023 gg accg ctcatg g tg ctcc 
6024 gaccgctcatggtgctcca 



6025 atgcatgtlagtgcggaaa 



6026 ttatgtccaaatggccttc 



6027ccaaatggccttcatgaga 



6028 ccttcatgagactg ggcgc 



6029ccttgcggtagcagtggag 



6030tgtcgtcttctctgacatg 
6031 



tggggggcagacaccgcgg 



6032ggggggcagacaccgcggc 



2721 



2725 



2779 



2783 



2801 



2887 



2918 



2926 



2933 



3008 



3010 



3032 



3033 



3106 



3139 



3145 



3153 



3241 



3262 



3299 



3300 



6033 gtggggacatcatcctggg 



6034tggggacatcatcctgggc 



6035ggggacatcatcctgggcc 



6036acctgtctccgcccgaagg 



6037tgtctccgcccgaagggga 



6038gggagatactcctggggcc 



6039ctcccaacagacccggggc 



6040 tccaccgcaacacaatctt 
6041 Icacaatctttcctggcgac 



6042 ggctggccggcgccccccg 



6043ccccggggcgcgttccctg 



6044 tccctgacaccatgcacct 



6045ttccggtgcgccggcgggg 



6046 ctcccccaggcctgtctcc 



6047 gggggttgcaaaggcggtg 



6048 tttgtccccgttgagtcca 



6049 ccgtaccgcaaacattcca 



6050 caagtggcccatctacacg 
6051 Icacgctcccactggcagcg 



6052 ccgcatatgcggcccaagg 



6053cgtatatgtctaaagcaca 
6054 1 
6055 i 



gtatatgtctaaagcacat 



ggaccattaccacgggcgc 



6056 Gccccattacgtactccac 

60571 

6058i 



agttccttgccgacggtgg 



gagacggctggagcgcggc 



3321 



3322 



3323 



3343 



3346 



3366 



3439 



3530 



3540 



3671 



3685 



3698 



3762 



3802 



3904 



3926 



3996 



4013 



4028 



4068 



4140 



4141 



4191 



4209 



4236 



4352 



27401SEQ 
2744 SEQ 

SEQ 



2798 



2802 



2820 



SEQ 
SEQ 

SEQ 
2937 SEQ 



2906 



2945 SEQ 
2952 SEQ 



3027 SEQ 



3029 SEQ 
SEQ 



3051 



3052 



SEQ 



3125SEQ 



31 58 SEQ 



3172 



3281 



3340 



3341 



3362SEQ 



3365SEQ 



3385 SEQ 



3458 SEQ 



3549SEQ 



3559 SEQ 



3164SEQ 
SEQ 



3260SEQ 

SEQ 
33 18 SEQ 



3319SEQ 
SEQ 



SEQ 
3342 SEQ 



3690 SEQ 



3704 SEQ 
SEQ 



3717 



3781 



SEQ 

SEQ 
3923 SEQ 



3821 



3945 SEQ 



4015SEQ 



4032 SEQ 



4047 SEQ 



4087SEQ 



4159 SEQ 
4160SEQ 



4210SEQ 
^SEQ 



4228 



4255 SEQ 

4371 SEQ 



ID NO 
ID NO 

ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 

ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 

ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 

ID NO 
ID NO 
ID NO 



6558 


agcagtgctcacttccatg 


6847 


6866 


1 


3 


6559 


gagggccgccacttgcggc 


9160 


9179 


1 


3 


6560 


gcacggcgaccgcccctcc 


7443 


7462 


1 


3 


6561 


ctccaggccaataggccat 


9404 


9423 




3 


6562 


qaccattaccacaqqcacc 


4192 


4211 




3 


6563 


ctcacaqqccqqqacaaqa 


3482 


3501 




3 


6564 


toaia atca ccci caca cac 


5742


5761


~~ll 


3 


6565 


Gcaactcataactaaaaoa 


6261

1 


6280 




3 


6566 


ggcctgttactccattgag


8959 


8978 




3 


6567 


tggctctctacgatgtggt 


8130 


8149 




3 


6568 


tatgacacccgctgttttg 


8267 


8286 




3 


6569 


ggagatcctgcggaagtcc


7171 


7190 




3 


6570 


tggaaactactatgcggtc 


3945 


3964 




3 


6571 


tttctgtaggggtaggcat 


9348 


9367 




3 


6572 


gaagccagacaggctataa 


8354 


8373 


1 


3 


6573 


tctcagcgacgggtcttgg 


7552 


7571 


1 


3 


6574 


gcgctcgtggccttcaagg 


5927 


5946 


1 


3 


6575 


ctccgcccgaaggggaagg 


3349 


3368 


1 


3 


6576 


cataatctacaccacaaca 


7717 


7736 




3 


6577 


ccaccttatcatattccca 


8083 


8102 




3 


6578 


QccQcccaactcQctcccc 


5792 


5811 




3 


6579 


cccatctacacactcccac 


4020 


4039 




3 


6580 


acccatctacacoctccca 


4019 


4038 




3 


6581 


aaccaaaoaatctcccccc 


6913 


6932 




3 


6582 


ccttta a caa a eta ca a a t 


7770 


7789 




3 


6583 


tccccggtcttcacagaca 


3962 


3981 


1 


3 


6584 


ggcccatctacacgctccc 


4018 


4037 


1 


3 


6585 


gcccccccttgagggggag 


7519 


7538 


1 


3 


6586 


aagaggctccaccagtgga 


6215 


6234 


1 


3 


6587 


gtcgtcggagtcgtgtgtg 


6020 


6039 


1 


3 


6588 


cgggttgttgcaaacagcc 


5542 


5561 


1 


3 


6589 


caggtttgtaactccgggg 


4840 


4859 


1 


3 


6590 


aggtcacgcgggtggggga 


6618 


6637 


1 


3 


6591 


ccccgttgagtccatggaa 


3931 


3950 


1 


3 


6592


ggagacatcgggccaggag 


9111 


9130 


1 


3 


6593 


caccctgcctgggaacccc 


5680 


5699 


1 


3 


6594 


tggagaccttctgggcaaa 


5613 


5632 


1 


3 


6595 


tggattgccaaatctacgg 


8940 


8959 


1 


3 


6596 


cgtgggtaggatcatcttg 


5389 


5408 




3 


6597 


cgctgcttcggctttcgtg 


5815 


5834 




3 


6598 


ccttcaaggtcatgagcgg


5937 


5956 




3 


6599 


tgtggaagtgtctcatacg 


5163 


5182 




3 


6600 


atgtggaagtgtctcatac


5162 


5181 




3 


6601 


gcgcgtgtcactcaggtcc 


6167 


6186 




3 


6602 


gtgggcccgggagaggggg 


6059 


6078 




3 


6603 


ccacagtcaaggctaaact 


7839 


7858 




3 


6604 


gccgggggaccccgatctc 


7537 


7556 


1 


3 
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SEQ ID NO: 605S 
SEQ ID NO: 6060
SEQ ID NO: 6061
SEQ ID NO: 6062
SEQ ID NO: 6063
SEQ ID NO: 6064
SEQ ID NO: 6065
SEQ ID NO: 6066
SEQ ID NO: 6067
SEQ ID NO: 6068
SEQ ID NO: 6069
SEQ ID NO: 6070
SEQ ID NO: 6071
SEQ ID NO: 6072
SEQ ID NO: 6073
SEQ ID NO: 6074
SEQ ID NO: 6075
SEQ ID NO: 6076
SEQ ID NO: 6077
SEQ ID NO: 6078
SEQ ID NO: 6079
SEQ ID NO: 6080
SEQ ID NO: 6081
SEQ ID NO: 6082
SEQ ID NO: 6083
SEQ ID NO: 6084
SEQ ID NO: 6085
SEQ ID NO: 6086
SEQ ID NO: 6087
SEQ ID NO: 6088
SEQ ID NO: 6089
SEQ ID NO: 6090
SEQ ID NO: 6091
SEQ ID NO: 6092
SEQ ID NO: 6093
SEQ ID NO: 6094
SEQ ID NO: 6095


3 caccgctacgcctccagga 


438^ 


440: 


^SEQ ID NC 


>: 660i 


5tcctacacatggacaggtg 


761 < 


9 7631 


3 


1 3 


)tggagagatccccttctac 


445( 


5 447^ 


^SEQ ID NC 


K 660e 


3 gtagcagtgctcacttcca 


684i 


5 686^ 


X 


1 3 


1 agccatccccatcgaagcc 


447i 


449C 


5SEQ ID NC 


K 660/ 


^ggctggttcgttgctggct 


925i 


7 927( 


3 


1 3 


I tccccatcgaagccatcaa 


4482 


4501 


1 SEQ ID NC 


»: 660? 


ttgagggggagccggggga


752j 


^ 754( 




1 3 


$ ccccatcgaagccatcaag 


448c 


450S 


^SEQ ID NO 


1: 6605 


5 cttgagggggagccggggg 


752C 


3 754f 


5 


t 3 


1 ggcctcggaatcaatgctg 


456E 


4587 


'SEQ ID NO 


1; 661C 


) cagctccgaattgtcgqcc 


741^ 


1 743: 


3 


1 3 


i gtccgtcataccgaccagc 


4B15 


I 463" 


SEQ ID NO 


»: 6611 


1 gctgagggatgtttgggac 


6271 


1 629( 


) 


1 3 


J gtcataccgaccagcggag 


46ie 


I 463S 


5SEQ ID NO 


: 6615 


I ctccattg a g ccacttq a c 


896E 


I 898"/ 




1 3 


^cgggctataccggtgactt 


466£ 


S 4687 


'SEQ ID NO 


: 661c 


^aagtccaagaagttccccg 


718^ 


I- 720G 


\ 1 


3 


ctttgattcagtgatcgac 


468^ 


\ 4702 


5SEQ ID NO 


: 661^ 


^ gtcgagttcctggtaaaag 


821c 


\ 8235 


J 


1 3 


acagtcgacttcagcttgg 


472^ 


i 474c 


5SEQ ID NO 


: 66ie 


iccaaatctacqqqqcctqt 


8947 


' 8966 




3 


cttggaccccaccttcacc 


4738 


\ 4757 


SEQ ID NO 
>SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 

SEQ ID NO 

SEQ ID NO 
SEQ ID NO: 
SEQ ID NO. 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 

SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 

3EQ ID NO: 

3EQ ID NO: 
SEQ ID NO: 
3EQ ID NO: 
5EQ ID NO: 
SEQ ID NO: 

BEQ ID NO: 
5EQ ID NO: 

5EQ ID NO: 


: 66ie 


> ggtgttgagtgacttcaag 


6301 


632C 


) 1 


3 


gagacgacgaccgtgcctxj 


476C 


1 477S 


: 6617 


' ggggacaaccgatcgtctc 


1891 


191C 


1 1 


3 


ggggtaggactggcagggg 


4806 


482S 


: 6618 


ccccccggggacttgcccc 


8657 


' 867e 


\ 1 


3 


gggcatatacaggtttgta . 


4831 


485C 


: 6619 


tacacatggacaggtgccc 


7622 


: 7641 




3 


gggggaacggccctcgggc 


4855 


4874 


; 6620 


gcccctgcacgccttcccc 


6546 


> 6565 




3 


tgacgcgggctgtgcttgg 


4906 


4925 


: 6621 


ccaattgacaccaccgtca 


8009 


8028 




3 


gacgcgggctgtgcttggt 


4907 


4926 


: 6622 


accaattgacaccaccgtc 


8008 


8027 




3 


tgcttggtacgagctcacc 


4918 


4937 


: 6623 


ggtgcggctgttggcagca 


5849 


5868 




3 


tgcccacttcctgtcccag 


5050 


5069 


: 6624 


ctgggcgcgctgacgggca 


3164 


3183 




3 


ggtggcataccaagccaca 


5101 


5120 


: 6625 


tgtgacaccaattgacacc 


8002 


8021 




3 


gggctcaggccccacctcc 


5130 


5149 


: 6626 


ggaggccgcaagccagccc 


8066 


8085 




3 


ccatcgtgggatcaaatgt 


5147 


5166 


• 6627 


acattctggcgggctatgg


5892 


5911 




3 


tcatacggctaaaacccac 


5175 


5194 


6628 


gtggccttcaaggtcatga 


5933 


5952 




3 


tgctgtataggctaggggc 


5214 


5233 


6629 


gcccgaaccggacgtagca 


6832 


6851 




3 


ccaaatacatcatggcatg 


5268 


5287 


6630 


catgcctcaggaaacttgg 


9072 


9091 




3 


ggagtcctcgcagctctgg 


5336 


5355 


6631 


ccagctgtctgcgccctcc 


6955 


6974 




3 


gcctgacaacaggcagtgt 


5364 


5383 


6632 


acactccaggccaataggc 


9401 


9420 




3 


agccaccaagcaggcggag 


5557 


5676 


6633 


ctccagttaactcctggct 


8820 


8839 




3 


catgtggaatttcatcagc 


5635 


5654 


6634 


gctgcgccatcacaacatg 


7702 


7721 




3 


ctctatcaccagcccgctc 


5728 


5747 


6635 


gagccgcatgactgcagag 


9565 


9584 




3 


cccagaacaccctcctgtt 


5751 


5770 


6636 


aacatcttgggggggtggg 


5771 


5790 




3 




Of Did 


o/^ol 


6637 


ccaatcgatgaacggggag


9378 


9397 




3 


"tnnnnnnntnnntanr«/^n 

'■^yyyyyyy'-yyy i-agccg 


/ I 


O/ yo 


6638 


cggcgccaaactattccaa 


6564 


6583 




3 




OO 1 0 


OtJO/ ^ 


6639 


gcccgaaccggacgtagca 


6832 


6851 




3 


[cgtgggcgctggtatcgc 


5829 


5848, 


6640 


gcgagcggcgtgctgacga 


8453 


8472 




3 








ob41 


gccacgacatcccgcagcg 


7727 


7746 




3 


QPO ir\ MO- RHQfi/ 
otW lU INILJ. au»Di 


^gQciguggcagcaiagg 


ITQCO 
OOOO 


bo72{ 


6642 


:;ctagactctttcgagocg 


7111 


7130 




3 


SEQ ID NO: 6097


ggggcaggggtggctggcg 


5909 


5928 J 


6643 


[jgcccaactcgctcccccc 


5794 


5813 




3 


SEQ ID NO: 6098 


-tggcgcgctcgtggcctt 


5922 


5941 < 


6644 J 


aagggaggccgcaagccag 


8063 


8082 




3 


SEQ ID NO: 60991 


gg cgcgctcgtggccttc 


5923 


5942 J 


6645 < 


^aagggaggccgcaagcca 


8062 


8081 




3 


SEQ ID NO: 6100( 


gagcggcgaggcgccctct 




5969 < 


66462 


agagcgtcgtctgctgctc 


7596 


7615 




3 


SEQ ID NO: 6101 1 


gggcccgggagagggggc 


6060 


6079 < 


66475 


gcccatctacacgctccca 


4019 


4038 




3 


SEQ ID NO: 6102c 


)ggctgatagcgttcgctt 


6095 


6114J 


66482 


aagcaggcggaggctgccg 


5564 






3 


SEQ ID NO: 6103


^tgcctgagagcgacgccg 


6146 


6165c 


6649c 


^ggccgccgacagcggcac 


7428 


7447 




3 


SEQ ID NO: 6104s 


atgaggactgttctacgcc 


6237 


6256 e 


6650g 


jgcggggggacggcatcat 


6399 


6418 




3 


SEQ ID NO: 6105c 


jtccaagctcctgccgcgg 


6331 


6350 E 


6651c 


scgctccgtgtgggaggac 


7969 


7988 




3 
SEQ ID NO: 6106
SEQ ID NO: 6107
SEQ ID NO: 6108
SEQ ID NO: 6109
SEQ ID NO: 6110
SEQ ID NO: 6111
SEQ ID NO: 6112
SEQ ID NO: 6113
SEQ ID NO: 6114
SEQ ID NO: 6115
SEQ ID NO: 6116
SEQ ID NO: 6117
SEQ ID NO: 6118
SEQ ID NO: 6119
SEQ ID NO: 6120
SEQ ID NO: 6121
SEQ ID NO: 6122
SEQ ID NO: 6123
SEQ ID NO: 6124
SEQ ID NO: 6125
SEQ ID NO: 6126
SEQ ID NO: 6127
SEQ ID NO: 6128
SEQ ID NO: 6129
SEQ ID NO: 6130
SEQ ID NO: 6131
SEQ ID NO: 6132


lacagatcgccggacatgtc 


6442 


6461 


ISEQ ID NO 


: 6652 


! gacatatatcacagcctgt 


9287 


' 93061 1 


3 


'acgtggcatggaacattcc 


650C 


652£ 


>SEQ ID NO 


: 665c 


5 ggaagaacccggactacgt 


7257 


' 727e 


) 1 


3 


gggcccctgcacgccttcc 


6544 


r 6565 


^SEQ ID NO 
SEQ ID NO 
»SEQ ID NO 
:SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
3EQ ID NO: 
3EQ ID NO: 


. 665-^ 


[ ggaagaaagcaagctgccc 


766C 


767C 


) 1 


3 


agtgcccatgtcaggttcc 


667S 


669^ 


: 665£ 


iggaaacagctagacacact 


8802 




} i 

m 1 


3 


tgcccatgtcaggttccag 


6677 


6696 


: 665€ 


^otgggcgcgctgacgggca 


3164 


L 3182 


; 1 

r 1 


3 


cagctcctgagtttttcac 


6693 


6712 


: 6657 


'gtgagagcgtcgtctgctg 


759c 


7612 


' 1 


3 


bacggaggtggatggggt 


6708 


6727 


: 665£ 


> acccttcctcaagccgtga 


8153 


8172 


' 1 


3 


cacggaggtggatggggtg 


6709 


6728 


: 6659 


cacccttcctcaagccgtg 


8152 


8171 


1 
1 


3 


gacccctcccacattacag 


6872 


6891 


: 6660 


ctgttttgactcaacggtc 


8278 


8297 


1 

1 


3 


ttggccagggggtctcccc 


6911 


6930 


: 6661 


ggggtgggtagccgcccaa 


5782 


5801 


1 


3 


ccttgagggcgacatgcac 


697? 


6991 


: 6662 


gtgcttaaggagatgaagg 


7811 


7830 


1 
1 


3 


ggagatgggcggaaacatc 


7060 


7079 


; Dooo 


gatgacccatttcttctcc 


8887 


8906 


1 


3 


gagatgggcggaaacatca 


7061 


7080 


: 66S4 


tgatgacccatttcttctc 


8886 


8905 


1 

1 


3 


ctagactctttcgagccgc 


7112 


7131 


6665 


gcggcgtgctgacgactag 


8457 


8476 


1 

1 


3 


tagactctttcgagccgct


7113 


7132 


6666 


agcgacgggtcttggtcta 


7556 


7575 


1 


3 


agaatgaaatatccattgc 


7149 


7168 


6667 


gcaaagaatgaggttttct 


8030 


8049 


i 


3 


ttgcggcggagatcctgcg 


7164 


7183 


6668 


cgcacgatgcatctggcaa 


8730 


8749 


1 


3 


agcgaggaggctggtgaga 


7580 


7599 


6669 


tctcgtgcccgaccccgct 


9305 


9324 


1 


3 


tgagagcgtcgtctgctgc 


7594 


7613 


6670 


gcagiaaagaccaagctca 


9197 


9216 


1 


3 


gtcgtctgctgctcaatgt 


7601 


7620 


6671 


acatggtctacgccacgac 


7716 


7735 


1 


3 


tgcgccatcacaacatggt 


7704 


7723 


6672 


accatgtctcccccacgca 


6123 


6142 


1 


3 


cagaagaaggtcacctttg 


7757 


7776 


6673 


caaagaatgaggttttctg 


8031 


8050 


1 


3 


cctggatgaccattaccgg 


7789 


7808 


6674 


ccggaacctatccagcagg 


7936 


7955 


1 


3 


ggacgtgcttaaggagatg 


7807 


7826, 


6675 


catcgggccaggagcgtcc 


9116 


9135 


1 


3 


aaagaatgaggttttctgc 


8032 


8051: 


6676 


gcagaagaaggtcaccttt 


7756 


7775 


1 


3 


agttcgtgtatgcgagaag 


8110 


8129: 


66771 


sttcatgcctcaggaaact 


9069 


9088 


1 


3 


ggctataaaatcgctcaca 


8365 


8384J 


66781 


:gtgaaaggtccgtgagcc 


9551 


9570 


1 


3 


SEQIDNO: 61331 


tetccatccttctagctc 


8900 


891 9< 


6679 


gagcggagggggatgagaa 


7134 


7153 


1 


3 


SEQIDNO: 61341 


gtctcgtgcccgaccccg 


9303 


9322 J 


6680 


'ggggcgcgttccctgaca 


3688 


3707 


1 


3 
Table 14. Sequences from human hepatitis C virus (HCV) (Direct Match Type)

Source                                Start Index  End Index  Match                                 Start Index  End Index  Match #
SEQ ID NO: 6755  ttttttttttttttttttt  9446         9465       SEQ ID NO: 6758  ttttttttttttttttttt  9466         9485       2
SEQ ID NO: 6756  ttttttttttttttttttt  9446         9465       SEQ ID NO: 6759  ttttttttttttttttttt  9465         9484       1
SEQ ID NO: 6757  ttttttttttttttttttt  9447         9466       SEQ ID NO: 6760  ttttttttttttttttttt  9466         9485       1
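
The Start Index and End Index columns of Table 14 differ by 19 in every row, which is consistent with a 19-nucleotide window that recurs elsewhere in the same sequence. The following is a minimal sketch of how such direct matches could be enumerated. It assumes half-open, zero-based offsets (end = start + 19); the function names `direct_matches` and `match_rows` are hypothetical, and the way the Match # column was derived is not stated in the table, so it is not reproduced here.

```python
from collections import defaultdict


def direct_matches(seq, k=19):
    """Map each k-mer that occurs more than once in seq to its start offsets."""
    seq = seq.lower()
    positions = defaultdict(list)
    for start in range(len(seq) - k + 1):
        positions[seq[start:start + k]].append(start)
    # Keep only windows that recur, i.e. have at least one direct match.
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


def match_rows(seq, k=19):
    """List (sequence, source start, source end, match start, match end) tuples."""
    rows = []
    for kmer, starts in direct_matches(seq, k).items():
        for i, src in enumerate(starts):
            for hit in starts[i + 1:]:
                # Offsets are half-open: end = start + k (an assumption).
                rows.append((kmer, src, src + k, hit, hit + k))
    return rows
```

Applied to a long poly-T run such as the one listed in Table 14, `match_rows` would report many overlapping window pairs, because every 19-nucleotide window in the run is identical.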



Table 15. Sequences of Exemplary Gene Targets

>gi|4502152|ref|NM_000384.1| Homo sapiens apolipoprotein B (including Ag(x)
antigen) (APOB), mRNA

ATTCCCACCGGGACCTGCGGGGCTGAGTGCCCTTCTCGGTTGCTGCCGCTGAGGAGCCCGCCCAGCCAGC 
CAGGGCCGCGAGGCCGAGGCCAGGCCGCAGCCCAGGAGCCGCCCCACCGCAGCTGGCGATGGACCCGCCG 

AGGCCCGCGCTGCTGGCGCTGCTGGCGCTGCCTGCGCTGCTGCTGCTGCTGCTGGCGGGCGCCAGGGCCG
AAGAGGAAATGCTGGAAAATGTCAGCCTGGTCTGTCCAAAAGATGCGACCCGATTCAAGCACCTCCGGAA 
GTACACATACAACTATGAGGCTGAGAGTTCCAGTGGAGTCCCTGGGACTGCTGATTCAAGAAGTGCCACC 
AGGATCAACTGCAAGGTTGAGCTGGAGGTTCCCCAGCTCTGCAGCTTCATCCTGAAGACCAGCCAGTGCA 
CCCTGAAAGAGGTGTATGGCTTCAACCCTGAGGGCA^IAGCCTTGCTGAAGAAAACCAAGAACTCTGAGGA 

GTTTGCTGCAGCCATGTCCAGGTATGAGCTCAAGCTGGCCATTCCAGAAGGGAAGCAGGTTTTCCTTTAC
CCGGAGAT^GATGAACCTACTTACATCCTGAACATCAAGAGGGGCATCATTTCTGCCCTCCTGGTTCCCC 
CAGAGACAGAAGAAGCCAAGCAAGTGTTGTTTCTGGATACCGTGTATGGAAACTGCTCCACTCACTTTAC 
CGTCAAGACGAGGAAGGGCAATGTGGCAACAGAAATATCCACTGAAAGAGACCTGGGGCAGTGTGATCGC 
TTCAAGCCCATCCGCACAGGCATCAGCCCACTTGCTCTCATCAAAGGCATGACCCGCCCCTTGTCAACTC 

TGATCAGCAGCAGCCAGTCCTGTCAGTACACACTGGACGCTAAGAGGAAGCATGTGGCAGAAGCCATCTG
CAAGGAGCAACACCTCTTCCTGCCTTTCTCCTACAACAATAAGTATGGGATGGTAGCACAAGTGACACAG 
ACTTTGAAACTTGAAGACACACCAAAGATCAACAGCCGCTTCTTTGGTGAAGGTACTAAGAAGATGGGCC 
TCGCATTTGAGAGCACCAAATCCACATCACCTCCAAAGCAGGCCGAAGCTGTTTTGAAGACTCTCCAGGA 
ACTGAAAAAACTAACCATCTCTGAGCAAAATATCCAGAGAGCTAATCTCTTCAATAAGCTGGTTACTGAG 

CTGAGAGGCCTCAGTGATGAAGCAGTCACATCTCTCTTGCCACAGCTGATTGAGGTGTCCAGCCCCATCA
CTTTACAAGCCTTGGTTCAGTGTGGACAGCCTCAGTGCTCCACTCACATCCTCCAGTGGCTGAAACGTGT 
GCATGCCAACCCCCTTCTGATAGATGTGGTCACCTACCTGGTGGCCCTGATCCCCGAGCCCTCAGCACAG 
CAGCTGCGAGAGATCTTCT^CATGGCGAGGGATCAGCGCAGCCGAGCCACCTTGTATGCGCTGAGCCACG 
CGGTC7\ACAACTATCATAAGACAAACCCTACAGGGACCCAGGAGCTGCTGGACATTGCTAATTACCTGAT 

GGAACAGATTCT^GATGACTGCACTGGGGATGAAGATTACACCTATTTGATTCTGCGGGTCATTGGAAAT
ATGGGCCAAACCATGGAGCAGTTAACTCCAGAACTCAAGTCTTCAATCCTCAAATGTGTCCAAAGTACT^ 
AGCCATCACTGATGATCCAGAAAGCTGCCATCCAGGCTCTGCGGAAAATGGAGCCTAAAGAC7\AGGACCA 
GGAGGTTCTTCTTCAGACTTTCCTTGATGATGCTTCTCCGGGAGATAAGCGACTGGCTGCCTATCTTATG 
TTGATGAGGAGTCCTTCACAGGCAGATATTAACAAAATTGTCCAAATTCTACCATGGGAACAGAATGAGC 

AAGTGAAGAACTTTGTGGCTTCCCATATTGCCAATATCTTGAACTCAGAAGAATTGGATATCCAAGATCT
GAAAAAGTTAGTGAAAGTVAGCTCTGAAAGAATCTCAACTTCCAACTGTCATGGACTTCAGAAAATTCTCT 
CGGAACTATCT^ACTCTACAAATCTGTTTCTCTTCCATCACTTGACCCAGCCTCAGCCAAAATAGAAGGGA 
ATCTTATATTTGATCCAAATAACTACCTTCCTAAAGAAAGCATGCTGAAAACTACCCTCACTGCCTTTGG 
ATTTGCTTCAGCTGACCTCATCGAGATTGGCTTGGAAGGAAAAGGCTTTGAGCCAACATTGGAAGCTCTT 

TTTGGGAAGCT^GGATTTTTCCCAGACAGTGTCAACAAAGCTTTGTACTGGGTTAATGGTCAAGTTCCTG
ATGGTGTCTCTAAGGTCTTAGTGGACCACTTTGGCTATACCAAAGATGATAAACATGAGCAGGATATGGT 
AAATGGAATAATGCTCAGTGTTGAGAAGCTGATTAAAGATTTGAAATCCAAAGAAGTCCCGGAAGCCAGA 
GCCTACCTCCGCATCTTGGGAGAGGAGCTTGGTTTTGCCAGTCTCCATGACCTCCAGCTCCTGGGAAAGC 
TGCTTCTGATGGGTGCCCGCACTCTGCAGGGGATCCCCCAGATGATTGGAGAGGTCATCAGGAAGGGCTC 
AAAGAATGACTTTTTTCTTCACTACATCTTCATGGAGAATGCCTTTGAACTCCCCACTGGAGCTGGATTA
CAGTTGCAAATATCTTCATCTGGAGTCATTGCTCCCGGAGCCAAGGCTGGAGTAAAACTGGAAGTAGCCA

ACATGCAGGCTGAACTGGTGGCAAAACCCTCCGTGTCTGTGGAGTTTGTGACAAATATGGGCATCATCAT 

TCCGGACTTCGCTAGGAGTGGGGTCCAGATGAACACCAACTTCTTCCACGAGTCGGGTCTGGAGGCTCAT 

GTTGCCCTAAAAGCTGGGAAGCTGAAGTTTATCATTCCTTCCCCAAAGAGACCAGTCAAGCTGCTCAGTG 

GAGGCAACACATTACATTTGGTCTCTACCACCAAAACGGAGGTGATCCCACCTCTCATTGAGAACAGGCA 

GTCCTGGTCAGTTTGCAAGCAAGTCTTTCCTGGCCTGAATTACTGCACCTCAGGCGCTTACTCCAACGCC 

AGCTCCACAGACTCCGCCTCCTACTATCCGCTGACCGGGGACACCAGATTAGAGCTGGAACTGAGGCCTA 

CAGGAGAGATTGAGCAGTATTCTGTCAGCGCAACCTATGAGCTCCAGAGAGAGGACAGAGCCTTGGTGGA 

TACCCTGAAGTTTGTAACTCAAGCAGAAGGTGCGAAGCAGACTGAGGCTACCATGACATTCAAATATAAT 

CGGCAGAGTATGACCTTGTCCAGTGAAGTCCAAATTCCGGATTTTGATGTTGACCTCGGAACAATCCTCA 

GAGTTAATGATGAATCTACTGAGGGCAAAACGTCTTACAGACTCACCCTGGACATTCAGAACAAGAAAAT 

TACTGAGGTCGCCCTCATGGGCCACCTAAGTTGTGACACAAAGGAAGAAAGAAAAATCAAGGGTGTTATT 

TCCATACCCCGTTTGCAAGCAGAAGCCAGAAGTGAGATCCTCGCCCACTGGTCGCCTGCCAAACTGCTTC 

TCCAAATGGACTCATCTGCTACAGCTTATGGCTCCACAGTTTCCAAGAGGGTGGCATGGCATTATGATGA 

AGAGAAGATTGAATTTGAATGGAACACAGGCACCAATGTAGATACCAAAAAAATGACTTCCAATTTCCCT 

GTGGATCTCTCCGATTATCCTAAGAGCTTGCATATGTATGCTAATAGACTCCTGGATCACAGAGTCCCTG 

AAACAGACATGACTTTCCGGCACGTGGGTTCCAAATTAATAGTTGCAATGAGCTCATGGCTTCAGAAGGC 

ATCTGGGAGTCTTCCTTATACCCAGACTTTGCAAGACCACCTCAATAGCCTGAAGGAGTTCAACCTCCAG 

AACATGGGATTGCCAGACTTCCACATCCCAGAAAACCTCTTCTTAAAAAGCGATGGCCGGGTCAAATATA 

CCTTGAACAAGAACAGTTTGAAAATTGAGATTCCTTTGCCTTTTGGTGGCAAATCCTCCAGAGATCTAAA 

GATGTTAGAGACTGTTAGGACACCAGCCCTCCACTTCAAGTCTGTGGGATTCCATCTGCCATCTCGAGAG 

TTCCAAGTCCCTACTTTTACCATTCCCAAGTTGTATCAACTGCAAGTGCCTCTCCTGGGTGTTCTAGACC 

TCTCCACGAATGTCTACAGCAACTTGTACAACTGGTCCGCCTCCTACAGTGGTGGCAACACCAGCACAGA 

CCATTTCAGCCTTCGGGCTCGTTACCACATGAAGGCTGACTCTGTGGTTGACCTGCTTTCCTACAATGTG 

CAAGGATCTGGAGAAACAACATATGACCACAAGAATACGTTCACACTATCATGTGATGGGTCTCTACGCC 

ACAAATTTCTAGATTCGAATATCAAATTCAGTCATGTAGAAAAACTTGGAAACAACCCAGTCTCAAAAGG 

TTTACTAATATTCGATGCATCTAGTTCCTGGGGACCACAGATGTCTGCTTCAGTTCATTTGGACTCCAAA 

AAGAAACAGCATTTGTTTGTCAAAGAAGTCAAGATTGATGGGCAGTTCAGAGTCTCTTCGTTCTATGCTA 

AAGGCACATATGGCCTGTCTTGTCAGAGGGATCCTAACACTGGCCGGCTCAATGGAGAGTCCAACCTGAG 

GTTTAACTCCTCCTACCTCCAAGGCACCAACCAGATAACAGGAAGATATGAAGATGGAACCCTCTCCCTC 

ACCTCCACCTCTGATCTGCAAAGTGGCATCATTAAAAATACTGCTTCCCTAAAGTATGAGAACTACGAGC 

TGACTTTAAAATCTGACACCAATGGGAAGTATAAGAACTTTGCCACTTCTAACAAGATGGATATGACCTT 

CTCTAAGCAAAATGCACTGCTGCGTTCTGAATATCAGGCTGATTACGAGTCATTGAGGTTCTTCAGCCTG 

CTTTCTGGATCACTAAATTCCCATGGTCTTGAGTTAAATGCTGACATCTTAGGCACTGACAAAATTAATA 

GTGGTGCTCACAAGGCGACACTAAGGATTGGCCAAGATGGAATATCTACCAGTGCAACGACCAACTTGAA 

GTGTAGTCTCCTGGTGCTGGAGAATGAGCTGAATGCAGAGCTTGGCCTCTCTGGGGCATCTATGAAATTA 

ACAACAAATGGCCGCTTCAGGGAACACAATGCAAAATTCAGTCTGGATGGGAAAGCCGCCCTCACAGAGC 

TATCACTGGGAAGTGCTTATCAGGCCATGATTCTGGGTGTCGACAGCAAAAACATTTTCAACTTCAAGGT 

CAGTCAAGAAGGACTTAAGCTCTCAAATGACATGATGGGCTCATATGCTGAAATGAAATTTGACCACACA 

AACAGTCTGAACATTGCAGGCTTATCACTGGACTTCTCTTCAAAACTTGACAACATTTACAGCTCTGACA 

AGTTTTATAAGCAAACTGTTAATTTACAGCTACAGCCCTATTCTCTGGTAACTACTTTAAACAGTGACCT 

GAAATACAATGCTCTGGATCTCACCAACAATGGGAAACTACGGCTAGAACCCCTGAAGCTGCATGTGGCT 

GGTAACCTAAAAGGAGCCTACCAAAATAATGAAATAAAACACATCTATGCCATCTCTTCTGCTGCCTTAT 

CAGCAAGCTATAAAGCAGACACTGTTGCTAAGGTTCAGGGTGTGGAGTTTAGCCATCGGCTCAACACAGA 

CATCGCTGGGCTGGCTTCAGCCATTGACATGAGCACAAACTATAATTCAGACTCACTGCATTTCAGCAAT 

GTCTTCCGTTCTGTAATGGCCCCGTTTACCATGACCATCGATGCACATACAAATGGCAATGGGAAACTCG 

CTCTCTGGGGAGAACATACTGGGCAGCTGTATAGCAAATTCCTGTTGAAAGCAGAACCTCTGGCATTTAC 

TTTCTCTCATGATTACAAAGGCTCCACAAGTCATCATCTCGTGTCTAGGAAAAGCATCAGTGCAGCTCTT 

GAACACAAAGTCAGTGCCCTGCTTACTCCAGCTGAGCAGACAGGCACCTGGAAACTCAAGACCCAATTTA 

ACAACAATGAATACAGCCAGGACTTGGATGCTTACAACACTAAAGATAAAATTGGCGTGGAGCTTACTGG 

ACGAACTCTGGCTGACCTAACTCTACTAGACTCCCCAATTAAAGTGCCACTTTTACTCAGTGAGCCCATC 

AATATCATTGATGCTTTAGAGATGAGAGATGCCGTTGAGAAGCCCCAAGAATTTACAATTGTTGCTTTTG 

TAAAGTATGATAAAAACCAAGATGTTCACTCCATTAACCTCCCATTTTTTGAGACCTTGCAAGAATATTT 

TGAGAGGAATCGACAAACCATTATAGTTGTAGTGGAAAACGTACAGAGAAACCTGAAGCACATCAATATT 

GATCAATTTGTAAGAAAATACAGAGCAGCCCTGGGAAAACTCCCACAGCAAGCTAATGATTATCTGAATT 

CATTCAATTGGGAGAGACAAGTTTCACATGCCAAGGAGAAACTGACTGCTCTCAC7y\AAAAGTATAGAAT 

TACAGAAAATGATATACAAATTGCATTAGATGATGCCAAAATCAACTTTAATGAAAAACTATCTCAACTG 
CAGACATATATGATACAATTTGATCAGTATATTAAAGATAGTTATGATTTACATGATTTGAAAATAGCTA
TTGCTAATATTATTGATGAAATCATTGAAAAATTAAAAAGTCTTGATGAGCACTATCATATCCGTGTAAA 
TTTAGTAAAAACAATCCATGATCTACATTTGTTTATTGAAAATATTGATTTTAACAAAAGTGGAAGTAGT 
ACTGCATCCTGGATTCAAAATGTGGATACTAAGTACCAAATCAGAATCCAGATACAAGAAAAACTGCAGC
AGCTTAAGAGACACATACAGAATATAGACATCCAGCACCTAGCTGGAAAGTTAAAACAACACATTGAGGC
TATTGATGTTAGAGTGCTTTTAGATCAATTGGGAACTACAATTTCATTTGAAAGAATAAATGATGTTCTT 
GAGCATGTCAAACACTTTGTTATAAATCTTATTGGGGATTTTGAAGTAGCTGAGAAAATCAATGCCTTCA 
GAGCCAAAGTCCATGAGTTAATCGAGAGGTATGAAGTAGACCAACAAATCCAGGTTTTAATGGATAAATT 
AGTAGAGTTGACCCACCAATACAAGTTGAAGGAGACTATTCAGAAGCTAAGCAATGTCCTACAACAAGTT 

AAGATAAAAGATTACTTTGAGAAATTGGTTGGATTTATTGATGATGCTGTGAAGAAGCTTAATGAATTAT
CTTTTAAAACATTCATTGAAGATGTTAACAAATTCCTTGACATGTTGATAAAGAAATTAAAGTCATTTGA
TTACCACCAGTTTGTAGATGAAACCAATGACAAAATCCGTGAGGTGACTCAGAGACTCAATGGTGAAATT 
CAGGCTCTGGAACTACCACAAAAAGCTGAAGCATTAAAACTGTTTTTAGAGGAAACCAAGGCCACAGTTG 
CAGTGTATCTGGAAAGCCTACAGGACACCAAAATAACCTTAATCATCAATTGGTTACAGGAGGCTTTAAG 

TTCAGCATCTTTGGCTCACATGAAGGCCAAATTCCGAGAGACTCTAGAAGATACACGAGACCGAATGTAT
CAAATGGACATTCAGCAGGAACTTCAACGATACCTGTCTCTGGTAGGCCAGGTTTATAGCACACTTGTCA 
CCTACATTTCTGATTGGTGGACTCTTGCTGCTAAGAACCTTACTGACTTTGCAGAGCAATATTCTATCCA 
AGATTGGGCTAAACGTATGAAAGCATTGGTAGAGCAAGGGTTCACTGTTCCTGAAATCAAGACCATCCTT 
GGGACCATGCCTGCCTTTGAAGTCAGTCTTCAGGCTCTTCAGAAAGCTACCTTCCAGACACCTGATTTTA 

TAGTCCCCCTAACAGATTTGAGGATTCCATCAGTTCAGATAAACTTCAAAGACTTAAAAAATATAAAAAT
CCCATCCAGGTTTTCCACACCAGAATTTACCATCCTTAACACCTTCCACATTCCTTCCTTTACAATTGAC 
TTTGTCGAAATGAAAGTAAAGATCATCAGAACCATTGACCAGATGCAGAACAGTGAGCTGCAGTGGCCCG
TTCCAGATATATATCTCAGGGATCTGAAGGTGGAGGACATTCCTCTAGCGAGAATCACCCTGCCAGACTT 
CCGTTTACCAGAAATCGCAATTCCAGAATTCATAATCCCAACTCTCAACCTTAATGATTTTCAAGTTCCT 

GACCTTCACATACCAGAATTCCAGCTTCCCCACATCTCACACACAATTGAAGTACCTACTTTTGGCAAGC
TATACAGTATTCTGAAAATCCAATCTCCTCTTTTCACATTAGATGCAAATGCTGACATAGGGAATGGAAC 
CACCTCAGCAAACGAAGCAGGTATCGCAGCTTCCATCACTGCCAAAGGAGAGTCCAAATTAGAAGTTCTC 
AATTTTGATTTTCAAGCAAATGCACAACTCTCAAACCCTAAGATTAATCCGCTGGCTCTGAAGGAGTCAG 
TGAAGTTCTCCAGCAAGTACCTGAGAACGGAGCATGGGAGTGAAATGCTGTTTTTTGGAAATGCTATTGA 

GGGAAAATCAAACACAGTGGCAAGTTTACACACAGAAAAAAATACACTGGAGCTTAGTAATGGAGTGATT
GTCAAGATAAACAATCAGCTTACCCTGGATAGCAACACTAAATACTTCCACAAATTGAACATCCCCAAAC 
TGGACTTCTCTAGTCAGGCTGACCTGCGCAACGAGATCAAGACACTGTTGAAAGCTGGCCACATAGCATG 
GACTTCTTCTGGAAAAGGGTCATGGAAATGGGCCTGCCCCAGATTCTCAGATGAGGGAACACATGAATCA 
CAAATTAGTTTCACCATAGAAGGACCCCTCACTTCCTTTGGACTGTCCAATAAGATCAATAGCAAACACC 

TAAGAGTAAACCAAAACTTGGTTTATGAATCTGGCTCCCTCAACTTTTCTAAACTTGAAATTCAATCACA
AGTCGATTCCCAGCATGTGGGCCACAGTGTTCTAACTGCTAAAGGCATGGCACTGTTTGGAGAAGGGAAG 
GCAGAGTTTACTGGGAGGCATGATGCTCATTTAAATGGAAAGGTTATTGGAACTTTGAAAAATTCTCTTT 
TCTTTTCAGCCCAGCCATTTGAGATCACGGCATCCACAAACAATGAAGGGAATTTGAAAGTTCGTTTTCC 
ATTAAGGTTAACAGGGAAGATAGACTTCCTGAATAACTATGCACTGTTTCTGAGTCCCAGTGCCCAGCAA 

GCAAGTTGGCAAGTAAGTGCTAGGTTCAATCAGTATAAGTACAACCAAAATTTCTCTGCTGGAAACAACG
AGAACATTATGGAGGCCCATGTAGGAATAAATGGAGAAGCAAATCTGGATTTCTTAAACATTCCTTTAAC 
AATTCCTGAAATGCGTCTACCTTACACAATAATCACAACTCCTCCACTGAAAGATTTCTCTCTATGGGAA 
AAAACAGGCTTGAAGGAATTCTTGAAAACGACAAAGCAATCATTTGATTTAAGTGT7VAAAGCTCAGTATA
AGAAAAACAAACACAGGCATTCCATCACAAATCCTTTGGCTGTGCTTTGTGAGTTTATCAGTCAGAGCAT 

CAAATCCTTTGACAGGCATTTTGAAAAAAACAGAAACAATGCATTAGATTTTGTCACCAAATCCTATAAT
GAAACAAAAATTAAGTTTGATAAGTACAAAGCTGAAAAATCTCACGACGAGCTCCCCAGGACCTTTCAAA 
TTCCTGGATACACTGTTCCAGTTGTCAATGTTGAAGTGTCTCCATTCACCATAGAGATGTCGGCATTCGG 
CTATGTGTTCCCAAAAGCAGTCAGCATGCCTAGTTTCTCCATCCTAGGTTCTGACGTCCGTGTGCCTTCA 
TACACATTAATCCTGCCATCATTAGAGCTGCCAGTCCTTCATGTCCCTAGAAATCTCAAGCTTTCTCTTC 

CACATTTCAAGGAATTGTGTACCATAAGCCATATTTTTATTCCTGCCATGGGCAATATTACCTATGATTT
CTCCTTTAAATCAAGTGTCATCACACTGAATACCAATGCTGAACTTTTTAACCAGTCAGATATTGTTGCT 
CATCTCCTTTCTTCATCTTCATCTGTCATTGATGCACTGCAGTACAAATTAGAGGGCACCACAAGATTGA 
CAAGAAAAAGGGGATTGAAGTTAGCCACAGCTCTGTCTCTGAGCAACTIAATTTGTGGAGGGTAGTCATAA 
CAGTACTGTGAGCTTAACCACGAAAAATATGGAAGTGTCAGTGGCAAAAACCACAAAAGCCGAAATTCCA 

ATTTTGAGAATGAATTTCAAGCAAGAACTTAATGGAAATACCAAGTCAAAACCTACTGTCTCTTCCTCCA
TGGAATTTAAGTATGATTTCAATTCTTCAATGCTGTACTCTACCGCTAAAGGAGCAGTTGACCACAAGCT 
TAGCTTGGAAAGCCTCACCTCTTACTTTTCCATTGAGTCATCTACCAAAGGAGATGTCAAGGGTTCGGTT 
CTTTCTCGGGAATATTCAGGAACTATTGCTAGTGAGGCCAACACTTACTTGAATTCCAAGAGCACACGGT
CTTCAGTGAAGCTGCAGGGCACTTCCAAAATTGATGATATCTGGAACCTTGAAGTAAAAGAAAATTTTGC 
TGGAGAAGCCACACTCCAACGCATATATTCCCTCTGGGAGCACAGTACGAAAAACCACTTACAGCTAGAG 
GGCCTCTTTTTCACCAACGGAGAACATACAAGCAAAGCCACCCTGGAACTCTCTCCATGGCAAATGTCAG 
CTCTTGTTCAGGTCCATGCAAGTCAGCCCAGTTCCTTCCATGATTTCCCTGACCTTGGCCAGGAAGTGGC
CCTGAATGCTAACACTAAGAACCAGAAGATCAGATGGAAAAATGAAGTCCGGATTCATTCTGGGTCTTTC 
CAGAGCCAGGTCGAGCTTTCCAATGACCAAGAAAAGGCACACCTTGACATTGCAGGATCCTTAGAAGGAC 
ACCTAAGGTTCCTCAAAAATATCATCCTACCAGTCTATGACAAGAGCTTATGGGATTTCCTAAAGCTGGA 
TGTAACCACCAGCATTGGTAGGAGACAGCATCTTCGTGTTTCAACTGCCTTTGTGTACACCAAAAACCCC 

AATGGCTATTCATTCTCCATCCCTGTAAAAGTTTTGGCTGATAAATTCATTACTCCTGGGCTGAAACTAA
ATGATCTAAATTCAGTTCTTGTCATGCCTACGTTCCATGTCCCATTTACAGATCTTCAGGTTCCATCGTG 
CAAACTTGACTTCAGAGAAATACAAATCTATAAGAAGCTGAGAACTTCATCATTTGCCCTCAACCTACCA 
ACACTCCCCGAGGTAAAATTCCCTGAAGTTGATGTGTTAACAAAATATTCTCAACCAGAAGACTCCTTGA 
TTCCCTTTTTTGAGATAACCGTGCCTGAATCTCAGTTAACTGTGTCCCAGTTCACGCTTCCAAAAAGTGT 

TTCAGATGGCATTGCTGCTTTGGATCTAAATGCAGTAGCCAACAAGATCGCAGACTTTGAGTTGCCCACC
ATCATCGTGCCTGAGCAGACCATTGAGATTCCCTCCATTAAGTTCTCTGTACCTGCTGGAATTGTCATTC 
CTTCCTTTCAAGCACTGACTGCACGCTTTGAGGTAGACTCTCCCGTGTATAATGCCACTTGGAGTGCCAG 
TTTGAAAAACAAAGCAGATTATGTTGAAACAGTCCTGGATTCCACATGCAGCTCAACCGTACAGTTCCTA 
GAATATGAACTAAATGTTTTGGGAACACACAAAATCGAAGATGGTACGTTAGCCTCTAAGACTAAAGGAA 

CACTTGCACACCGTGACTTCAGTGCAGAATATGAAGAAGATGGCAAATTTGAAGGACTTCAGGAATGGGA
AGGAAAAGCGCACCTCAATATCAAAAGCCCAGCGTTCACCGATCTCCATCTGCGCTACCAGAAAGACAAG 
AAAGGCATCTCCACCTCAGCAGCCTCCCCAGCCGTAGGCACCGTGGGCATGGATATGGATGAAGATGACG 
ACTTTTCTAAATGGAACTTCTACTACAGCCCTCAGTCCTCTCCAGATAAAAAACTCACCATATTCAAAAC 
TGAGTTGAGGGTCCGGGAATCTGATGAGGAAACTCAGATCAAAGTTAATTGGGAAGAAGAGGCAGCTTCT 

GGCTTGCTAACCTCTCTGAAAGACAACGTGCCCAAGGCCACAGGGGTCCTTTATGATTATGTCAACAAGT
ACCACTGGGAACACACAGGGCTCACCCTGAGAGAAGTGTCTTCAAAGCTGAGAAGAAATCTGCAGAACAA 
TGCTGAGTGGGTTTATCAAGGGGCCATTAGGCAAATTGATGATATCGACGTGAGGTTCCAGAAAGCAGCC 
AGTGGCACCACTGGGACCTACCAAGAGTGGAAGGACAAGGCCCAGAATCTGTACCAGGAACTGTTGACTC 
AGGAAGGCCAAGCCAGTTTCCAGGGACTCAAGGATAACGTGTTTGATGGCTTGGTACGAGTTACTCAAAA 

ATTCCATATGAAAGTCAAGCATCTGATTGACTCACTCATTGATTTTCTGAACTTCCCCAGATTCCAGTTT
CCGGGGAAACCTGGGATATACACTAGGGAGGAACTTTGCACTATGTTCATAAGGGAGGTAGGGACGGTAC 
TGTCCCAGGTATATTCGAAAGTCCATAATGGTTCAGAAATACTGTTTTCCTATTTCCAAGACCTAGTGAT 
TACACTTCCTTTCGAGTTAAGGAAACATAAACTAATAGATGTAATCTCGATGTATAGGGAACTGTTGAAA 
GATTTATCAAAAGAAGCCCAAGAGGTATTTAAAGCCATTCAGTCTCTCAAGACCACAGAGGTGCTACGTA 

ATCTTCAGGACCTTTTACAATTCATTTTCCAACTAATAGAAGATAACATTAAACAGCTGAAAGAGATGAA
ATTTACTTATCTTATTAATTATATCCAAGATGAGATCAACACAATCTTCAATGATTATATCCCATATGTT 
TTTAAATTGTTGAAAGAAAACCTATGCCTTAATCTTCATAAGTTCAATGAATTTATTCAAAACGAGCTTC 
AGGAAGCTTCTCAAGAGTTACAGCAGATCCATCAATACATTATGGCCCTTCGTGAAGAATATTTTGATCC 
AAGTATAGTTGGCTGGACAGTGAAATATTATGAACTTGAAGAAAAGATAGTCAGTCTGATCAAGAACCTG 

TTAGTTGCTCTTAAGGACTTCCATTCTGAATATATTGTCAGTGCCTCTAACTTTACTTCCCAACTCTCAA
GTCAAGTTGAGCAATTTCTGCACAGAAATATTCAGGAATATCTTAGCATCCTTACCGATCCAGATGGAAA 
AGGGAAAGAGAAGATTGCAGAGCTTTCTGCCACTGCTCAGGAAATAATTAAAAGCCAGGCCATTGCGACG 
AAGAAAATAATTTCTGATTACCACCAGCAGTTTAGATATAAACTGCAAGATTTTTCAGACCAACTCTCTG
ATTACTATGAAAAATTTATTGCTGAATCCAAAAGATTGATTGACCTGTCCATTCAAAACTACCACACATT 

TCTGATATACATCACGGAGTTACTGAAAAAGCTGCAATCAACCACAGTCATGAACCCCTACATGAAGCTT
GCTCCAGGAGAACTTACTATCATCCTCTAATTTTTTAAAAGAAATCTTCATTTATTCTTCTTTTCCAATT 
GAACTTTCACATAGCACAGAAAAAATTCAAACTGCCTATATTGATAAAACCATACAGTGAGCCAGCCTTG 
CAGTAGGCAGTAGACTATAAGCAGAAGCACATATGAACTGGACCTGCACCAAAGCTGGCACCAGGGCTCG 
GAAGGTCTCTGAACTCAGAAGGATGGCATTTTTTGCAAGTTAAAGAAAATCAGGATCTGAGTTATTTTGC 

TAAACTTGGGGGAGGAGGAACAAATAAATGGAGTCTTTATTGTGTATCATA (SEQ ID NO: 6681)



>gi|4557442|ref|NM_000078.1| Homo sapiens cholesteryl ester transfer
protein, plasma (CETP), mRNA 

GTGAATCTCTGGGGCCAGGAAGACCCTGCTGCCCGGAAGAGCCTCATGTTCCGTGGGGGCTGGGCGGACA 
TACATATACGGGCTCCAGGCTGAACGGCTCGGGCCACTTACACACCACTGCCTGATAACCATGCTGGCTG
CCACAGTCCTGACCCTGGCCCTGCTGGGCAATGCCCATGCCTGCTCCAAAGGCACCTCGCACGAGGCAGG 
CATCGTGTGCCGCATCACCAAGCCTGCCCTCCTGGTGTTGAACCACGAGACTGCCAAGGTGATCCAGACC
GCCTTCCAGCGAGCCAGCTACCCAGATATCACGGGCGAGAAGGCCATGATGCTCCTTGGCCAAGTCAAGT 
ATGGGTTGCACAACATCCAGATCAGCCACTTGTCCATCGCCAGCAGCCAGGTGGAGCTGGTGGAAGCCAA 
GTCCATTGATGTCTCCATTCAGAACGTGTCTGTGGTCTTCAAGGGGACCCTGAAGTATGGCTACACCACT 
GCCTGGTGGCTGGGTATTGATCAGTCCATTGACTTCGAGATCGACTCTGCCATTGACCTCCAGATCAACA
CACAGCTGACCTGTGACTCTGGTAGAGTGCGGACCGATGCCCCTGACTGCTACCTGTCTTTCCATAAGCT 
GCTCCTGCATCTCCAAGGGGAGCGAGAGCCTGGGTGGATCAAGCAGCTGTTCACAAATTTCATCTCCTTC 
ACCCTGAAGCTGGTCCTGAAGGGACAGATCTGCAAAGAGATCAACGTCATCTCTAACATCATGGCCGATT 
TTGTCCAGACAAGGGCTGCCAGCATCCTTTCAGATGGAGACATTGGGGTGGACATTTCCCTGACAGGTGA 

TCCCGTCATCACAGCCTCCTACCTGGAGTCCCATCACAAGGGTCATTTCATCTACAAGAATGTCTCAGAG
GACCTCCCCCTCCCCACCTTCTCGCCCACACTGCTGGGGGACTCCCGCATGCTGTACTTCTGGTTCTCTG 
AGCGAGTCTTCCACTCGCTGGCCAAGGTAGCTTTCCAGGATGGCCGCCTCATGCTCAGCCTGATGGGAGA 
CGAGTTCAAGGCAGTGCTGGAGACCTGGGGCTTCAACACCAACCAGGAAATCTTCCAAGAGGTTGTCGGC 
GGCTTCCCCAGCCAGGCCCAAGTCACCGTCCACTGCCTCAAGATGCCCAAGATCTCCTGCCAAAACAAGG 

GAGTCGTGGTCAATTCTTCAGTGATGGTGAAATTCCTCTTTCCACGCCCAGACCAGCAACATTCTGTAGC
TTACACATTTGAAGAGGATATCGTGACTACCGTCCAGGCCTCCTATTCTAAGAAAAAGCTCTTCTTAAGC 
CTCTTGGATTTCCAGATTACACCAAAGACTGTTTCCAACTTGACTGAGAGCAGCTCCGAGTCCATCCAGA 
GCTTCCTGCAGTCAATGATCACCGCTGTGGGCATCCCTGAGGTCATGTCTCGGCTCGAGGTAGTGTTTAC 
AGCCCTCATGAACAGCAAAGGCGTGAGCCTCTTCGACATCATCAACCCTGAGATTATCACTCGAGATGGC 

TTCCTGCTGCTGCAGATGGACTTTGGCTTCCCTGAGCACCTGCTGGTGGATTTCCTCCAGAGCTTGAGCT
AGAAGTCTCCAAGGAGGTCGGGATGGGGCTTGTAGCAGAAGGCAAGCACCAGGCTCACAGCTGGAACCCT 
GGTGTCTCCTCCAGCGTGGTGGAAGTTGGGTTAGGAGTACGGAGATGGAGATTGGCTCCCAACTCCTCCC 
TATCCTAAAGGCCCACTGGCATTAAAGTGCTGTATCCAAG (SEQ ID NO: 6682) 




>gi|414668|emb|X75500.1|HSMTP H. sapiens mRNA for microsomal triglyceride
transfer protein 

TGCAGTTGAGGATTGCTGGTCAATATGATTCTTCTTGCTGTGCTTTTTCTCTGCTTCATTTCCTCATATT 
CAGCTTCTGTTAAAGGTCACACAACTGGTCTCTCATTAAATAATGACCGGCTGTACAAGCTCACGTACTC 

CACTGAAGTTCTTCTTGATCGGGGCAAAGGAAAACTGCAAGACAGCGTGGGCTACCGCATTTCCTCCAAC
GTGGATGTGGCCTTACTATGGAGGAATCCTGATGGTGATGATGACCAGTTGATCCAAATAACGATGAAGG 
ATGTAAATGTTGAAAATGTGAATCAGCAGAGAGGAGAGAAGAGCATCTTCAAAGGAAAAAGCCCATCTAA 
AATAATGGGAAAGGAAAACTTGGAAGCTCTGCAAAGACCTACGCTCCTTCATCTAATCCATGGAAAGGTC 
AAAGAGTTCTACTCATATCAAAATGAGGCAGTGGCCATAGAAAATATCAAGAGAGGTCTGGCTAGCCTAT 

TTCAGACACAGTTAAGCTCTGGAACCACCAATGAGGTAGATATCTCTGGAAATTGTAAAGTGACCTACCA
GGCTCATCAAGACAAAGTGATCAAAATTAAGGCCTTGGATTCATGCAAAATAGCGAGGTCTGGATTTACG 
ACCCCAAATCAGGTCTTGGGTGTCAGTTCAAAAGCTACATCTGTCACCACCTATAAGATAGAAGACAGCT 
TTGTTATAGCTGTGCTTGCTGAAGAAACACACAATTTTGGACTGAATTTCCTACAAACCATTAAGGGGAA 
AATAGTATCGAAGCAGAAATTAGAGCTGAAGACAACCGAAGCAGGCCCAAGATTGATGTCTGGAAAGCAG 

GCTGCAGCCATAATCAAAGCAGTTGATTCAAAGTACACGGCCATTCCCATTGTGGGGCAGGTCTTCCAGA
GCCACTGTAAAGGATGTCCTTCTCTCTCGGAGCTCTGGCGGTCCACCAGGAAATACCTGCAGCCTGACAA 
CCTTTCCAAGGCTGAGGCTGTCAGAAACTTCCTGGCCTTCATTCAGCACCTCAGGACTGCGAAGAAAGAA 
GAGATCCTTCAAATACTAAAGATGGAAAATAAGGAAGTATTACCTCAGCTGGTGGATGCTGTCACCTCTG 
CTCAGACCTCAGACTCATTAGAAGCCATTTTGGACTTTTTGGATTTCAAAAGTGACAGCAGCATTATCCT 

CCAGGAGAGGTTTCTCTATGCCTGTGGATTTGCTTCTCATCCCAATGAAGAACTCCTGAGAGCCCTCATT
AGTAAGTTCAAAGGTTCTATTGGTAGCAGTGACATCAGAGAAACTGTTATGATCATCACTGGGACACTTG 
TCAGAAAGTTGTGTCAGAATGAAGGCTGCAAACTCAAAGCAGTAGTGGAAGCTAAGAAGTTAATCCTGGG 
AGGACTTGAAAAAGCAGAGAAAAAAGAGGACACCAGGATGTATCTGCTGGCTTTGAAGAATGCCCTGCTT 
CCAGAAGGCATCCCAAGTCTTCTGAAGTATGCAGAAGCAGGAGT^AGGGCCCATCAGCCACCTGGCTACCA 

CTGCTCTCCAGAGATATGATCTCCCTTTCATAACTGATGAGGTGAAGAAGACCTTAAACAGAATATACCA
CCAAAACCGTAAAGTTCATGAAAAGACTGTGCGCACTGCTGCAGCTGCTATCATTTTAAATAACAATCCA 
TCCTACATGGACGTCAAGAACATCCTGCTGTCTATTGGGGAGCTTCCCCAAGAAATGAATAAATACATGC 
TCGCCATTGTTCAAGACATCCTACGTTTTGAAATGCCTGCAAGCAAAATTGTCCGTCGAGTTCTGAAGGA 
AATGGTCGCTCACAATTATGACCGTTTCTCCAGGAGTGGATCTTCTTCTGCCTACACTGGCTACATAGAA 

CGTAGTCCCCGTTCGGCATCTACTTACAGCCTAGACATTCTCTACTCGGGTTCTGGCATTCTAAGGAGAA
GTAACCTGAACATCTTTCAGTACATTGGGAAGGCTGGTCTTCACGGTAGCCAGGTGGTTATTGAAGCCCA
AGGACTGG7\AGCCTTAATCGCAGCCACCCCTGACGAGGGGGAGGAGAACCTTGACTCCTATGCTGGTATG 
TCAGCCATCCTCTTTGATGTTCAGCTCAGACCTGTCACCTTTTTCAACGGATACAGTGATTTGATGTCCA 
AAATGCTGTCAGCATCTGGCGACCCTATCAGTGTGGTGAAAGGACTTATTCTGCTAATAGATCATTCTCA 
GGAACTTCAGTTACAATCTGGACTAAAAGCCAATATAGAGGTCCAGGGTGGTCTAGCTATTGATATTTCA
GGTGCAATGGAGTTTAGCTTGTGGTATCGTGAGTCTAAAACCCGAGTGAAAAATAGGGTGACTGTGGTAA 
TAACCACTGACATCACAGTGGACTCCTCTTTTGTGAAAGCTGGCCTGGAAACCAGTACAGAAACAGAAGC 
AGGCTTGGAGTTTATCTCCACAGTGCAGTTTTCTCAGTACCCATTCTTAGTTTGCATGCAGATGGACAAG 
GATGAAGCTCCATTCAGGCAATTTGAGAAAAAGTACGAAAGGCTGTCCACAGGCAGAGGTTATGTCTCTC 

AGAAAAGAAAAGAAAGCGTATTAGCAGGATGTGAATTCCCGCTCCATCAAGAGAACTCAGAGATGTGCAA
AGTGGTGTTTGCCCCTCAGCCGGATAGTACTTCCAGCGGATGGTTTTGAAACTGACCTGTGATATTTTAC 
TTGAATTTGTCTCCCCGAAAGGGACACAATGTGGCATGACTAAGTACTTGCTCTCTGAGAGCACAGCGTT 
TACATATTTACCTGTATTTAAGATTTTTGTAAAAAGCTACAAAAAACTGCAGTTTGATCAAATTTGGGTA 
TATGCAGTATGCTACCCACAGCGTCATTTTGAATCATCATGTGACGCTTTCAACAACGTTCTTAGTTTAC 

TTATACCTCTCTCAAATCTCATTTGGTACAGTCAGAATAGTTATTCTCTAAGAGGAAACTAGTGTTTGTT
AAAAACAAAAATAAAAACAAAACCACACAAGGAGAACCCAATTTTGTTTCAACAATTTTTGATCAATGTA
TATGAAGCTCTTGATAGGACTTCCTTAAGCATGACGGGAAAACCAAACACGTTCCCTAATCAGGAAAAAA 
AAAAAAAAAAAAAAGTAAGACACAAACAAACCATTTTTTTCTCTTTTTTTGGAGTTGGGGGCCCAGGGAG 
AAGGGACAAGGCTTTTAAAAGACTTGTTAGCCAACTTCAAGAATTAATATTTATGTCTCTGTTATTGTTA 

GTTTTAAGCCTTAAGGTAGAAGGCACATAGAAATAACATC (SEQ ID NO: 6683)



>gi|1217638|emb|X91148.1|HSMTTP H. sapiens mRNA for microsomal triglyceride
transfer protein 

TGCAGTTGAGGATTGCTGGTCAATATGATTCTTCTTGCTGTGCTTTTTCTCTGCTTCATTTCCTCATATT 

CAGCTTCTGTTAAAGGTCACACAACTGGTCTCTCATTAAATAATGACCGGCTGTACAAGCTCACGTACTC
CACTGAAGTTCTTCTTGATCGGGGCAAAGGAAAACTGCAAGACAGCGTGGGCTACCGCATTTCCTCCAAC 
GTGGATGTGGCCTTACTATGGAGGAATCCTGATGGTGATGATGACCAGTTGATCCAAATAACGATGAAGG 
ATGTAAATGTTGAAAATGTGAATCAGCAGAGAGGAGAGAAGAGCATCTTCAAAGGAAAAAGCCCATCTAA 
AATAATGGGAAAGGAAAACTTGGAAGCTCTGCAAAGACCTACGCTCCTTCATCTAATCCATGGAAAGGTC 

AAAGAGTTCTACTCATATCAAAATGAGGCAGTGGCCATAGAAAATATCAAGAGAGGTCTGGCTAGCCTAT
TTCAGACACAGTTAAGCTCTGGAACCACCAATGAGGTAGATATCTCTGGAAATTGTAAAGTGACCTACCA 
GGCTCATCAAGACAAAGTGATCAAAATTAAGGCCTTGGATTCATGCAAAATAGCGAGGTCTGGATTTACG 
ACCCCAAATCAGGTCTTGGGTGTCAGTTCAAAAGCTACATCTGTCACCACCTATAAGATAGAAGACAGCT 
TTGTTATAGCTGTGCTTGCTGAAGAAACACACAATTTTGGACTGAATTTCCTACAAACCATTAAGGGGAA 

AATAGTATCGAAGCAGAAATTAGAGCTGAAGACAACCGAAGCAGGCCCAAGATTGATGTCTGGAAAGCAG
GCTGCAGCCATAATCAAAGCAGTTGATTCAAAGTACACGGCCATTCCCATTGTGGGGCAGGTCTTCCAGA 
GCCACTGTAAAGGATGTCCTTCTCTCTCGGAGCTCTGGCGGTCCACCAGGAAATACCTGCAGCCTGACAA 
CCTTTCCAAGGCTGAGGCTGTCAGAAACTTCCTGGCCTTCATTCAGCACCTCAGGACTGCGAAGAAAGAA 
GAGATCCTTCAAATACTAAAGATGGAAAATAAGGAAGTATTACCTCAGCTGGTGGATGCTGTCACCTCTG 

CTCAGACCTCAGACTCATTAGAAGCCATTTTGGACTTTTTGGATTTCAAAAGTGACAGCAGCATTATCCT
CCAGGAGAGGTTTCTCTATGCCTGTGGATTTGCTTCTCATCCCAATGAAGAACTCCTGAGAGCCCTCATT 
AGTAAGTTCAAAGGTTCTATTGGTAGCAGTGACATCAGAGAAACTGTTATGATCATCACTGGGACACTTG 
TCAGAAAGTTGTGTCAGAATGAAGGCTGCAAACTCAAAGCAGTAGTGGAAGCTAAGAAGTTAATCCTGGG 
AGGACTTGAAAAAGCAGAGAAAAAAGAGGACACCAGGATGTATCTGCTGGCTTTGAAGAATGCCCTGCTT 

CCAGAAGGCATCCCAAGTCTTCTGAAGTATGCAGAAGCAGGAGAAGGGCCCATCAGCCACCTGGCTACCA
CTGCTCTCCAGAGATATGATGCTCCCTTTCATAACTGATGAGGTGAAGAAGACCTTAAACAGAATATACC 
ACCAAAACCGTAAAGTTCATGAAAAGACTGTGCGCACTGCTGCAGCTGCTATCATTTTAAATAACAATCC 
ATCCTACATGGACGTCAAGAACATCCTGCTGTCTATTGGGGAGCTTCCCCAAGAAATGAATAAATACATG 
CTCGCCATTGTTCAAGACATCCTACGTTTTGAAATGCCTGCAAGCAAAATTGTCCGTCGAGTTCTGAAGG 

AAATGGTCGCTCACAATTATGACCGTTTCTCCAGGAGTGGATCTTCTTCTGCCTACACTGGCTACATAGA
ACGTAGTCCCCGTTCGGCATCTACTTACAGCCTAGACATTCTCTACTCGGGTTCTGGCATTCTAAGGAGA 
AGTAACCTGAACATCTTTCAGTACATTGGGAAGGCTGGTCTTCACGGTAGCCAGGTGGTTATTGAAGCCC 
AAGGACTGG7\AGCCTTAATCGCAGCCACCCCTGACGAGGGGGAGGAGAACCTTGACTCCTATGCTGGTAT 
GTCAGCCATCCTCTTTGATGTTCAGCTCAGACCTGTCACCTTTTTCAACGGATACAGTGATTTGATGTCC 

AAAATGCTGTCAGCATCTGGCGACCCTATCAGTGTGGTGAAAGGACTTATTCTGCTAATAGATCATTCTC
AGGAACTTCAGTTACAATCTGGACTAAAAGCCAATATAGAGGTCCAGGGTGGTCTAGCTATTGATATTTC 
AGGTGCAATGGAGTTTAGCTTGTGGTATCGTGAGTCTAAAACCCGAGTGAAAAATAGGGTGACTGTGGTA
ATAACCACTGACATCACAGTGGACTCCTCTTTTGTGAAAGCTGGCCTGGAAACCAGTACAGAAACAGAAG 
CAGGCTTGGAGTTTATCTCCACAGTGCAGTTTTCTCAGTACCCATTCTTAGTTTGCATGCAGATGGACAA 
GGATGAAGCTCCATTCAGGCAATTTGAGAAAAAGTACGAAAGGCTGTCCACAGGCAGAGGTTATGTCTCT 
CAGAAAAGAAAAGAAAGCGTATTAGCAGGATGTGAATTCCCGCTCCATCAAGAGAACTCAGAGATGTGCA
AAGTGGTGTTTGCCCCTCAGCCGGATAGTACTTCCAGCGGATGGTTTTGAAACTGACCTGTGATATTTTA 
CTTGAATTTGTCTCCCCGAAAGGGACACAATGTGGCATGACTAAGTACTTGCTCTCTGAGAGCACAGCGT 
TTACATATTTACCTGTATTTAAGATTTTTGTAAAAAGCTACAAAAAACTGCAGTTTGATCAAATTTGGGT 
ATATGCAGTATGCTACCCACAGCGTCATTTTGAATCATCATGTGACGCTTTCAACAACGTTCTTAGTTTA 

CTTATACCTCTCTCAAATCTCATTTGGTACAGTCAGAATAGTTATTCTCTAAGAGGAAACTAGTGTTTGT
TAAAAACAAAAATAAAAACAAAACCACACAAGGAGAACCCAATTTTGTTTCAACAATTTTTGATCAATGT 
ATATGAAGCTCTTGATAGGACTTCCTTAAGCATGACGGGAAAACCAAACACGTTCCCTAATCAGGAAAAA 
AAAAAAAAAAGAAAAAGTAAGACACAAACAAACCATTTTTTTCTCTTTTTTTGGAGTTGGGGGCCCAGGG 
AGAAGGGACAAGGCTTTTAAAAGACTTGTTAGCCAACTTCAAGAATTAATATTTATGTCTCTGTTATTGT 

TAGTTTTAAGCCTTAAGGTAGAAGGCACATAGAAATAACATCTCATCTTTCTGCTGACCATTTTAGTGAG
GTTGTTCCAAAGAGCATTCAGGTCTCTACCTCCAGCCCTGCAAAAATATTGGACCTAGCACAGAGGAATC 
AGGAAAATTAATTTCAGAAACTCCATTTGATTTTTCTTTTGCTGTGTCTTTTTTGAGACTGTAATATGGT 
ACACTGTCCTCTAAGGACATCCTCATTTTATCTCACCTTTTTGGGGGTGAGAGCTCTAGTTCATTTAACT 
GTACTCTGCACAATAGCTAGGATGACTAAGAGAACATTGCTTCAAGAAACTGGTGGATTTGGATTTCCAA 

AATATGAAATAAGGAGAAAAATGTTTTTATTTGTATGAATTAAAAGATCCATGTTGAACATTTGCAAATA
TTTATTAATAAACAGATGTGGTGATAAACCCAAAACAAATGACAGGTGCTTATTTTCCACTAAACACAGA 
CACATGAAATGAAAGTTTAGCTAGCCCACTATTTGTTGTAAATTGAAAACGAAGTGTGATAAAATAAATA 
TGTAGAAATCAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 6684) 




>gi|21361125|ref|NM_001467.2| Homo sapiens glucose-6-phosphatase,
transport (glucose-6-phosphate) protein 1 (G6PT1), mRNA

GGCACGAGGGGCCACCGAGGCGCTGTCCCTGACCACCAGCACGAGACCCCTTTCTATCGCGCCAGTCCTG 
TGGTCTCCGCACCTCTCCAGCTCCTGCACCCCCGGCCCCCGTGGTTCCCAGCCGCACAGTAGCGTGTCCT 

GGGTAGCGTGAGGACCCACGGGGCTGAGCAGGTGCCACGAGCCCGCCGCCTCTTCGCCGCCCGCCGCCTC
TCCTCCTCTCCCGCCCGCCGCCTGGCCCTCCCCTACCAGGCTGAGCCTCTGGCTGCCAGAAGCGCGGGGC 
CTCCGGGAGAATACGTGCGGTCGCCCGCTCCGCGTGCGCCTACGCCTTCTGCTCCAGTTGCTTTCCCAAT 
TGAGCGGAAAAGCCGGGGCATGTTGCCGGGGCCCTGGGCGGGACGGTTGTGCCCTGCAGCCCGAAGCCCG 
CCGGGGCACCTTCCCGCCCACGAGCTGCCCAGTCCCTCTGCTTGCGGCCCCTGCCAACGTCCCACAGGAC 

ACTGGGTCCCCTTGGAGCCTCCCCAGGCTTAATGATTGTCCAGAAGGCGGCTATAAAGGGAGCCTGGGAG
GCTGGGTGGAGGAGGGAGCAGAAAAAACCCAACTCAGCAGATCTGGGAACTGTGAGAGCGGCAAGCAGGA 
ACTGTGGTCAGAGGCTGTGCGTCTTGGCTGGTAGGGCCTGCTCTTTTCTACCATGGCAGCCCAGGGCTAT 
GGCTATTATCGCACTGTGATCTTCTCAGCCATGTTTGGGGGCTACAGCCTGTATTACTTCAATCGCAAGA 
CCTTCTCCTTTGTCATGCCATCATTGGTGGAAGAGATCCCTTTGGACAAGGATGATTTGGGGTTCATCAC 

CAGCAGCCAGTCGGCAGCTTATGCTATCAGCAAGTTTGTCAGTGGGGTGCTGTCTGACCAGATGAGTGCT
CGCTGGCTCTTCTCTTCTGGGCTGCTCCTGGTTGGCCTGGTCAACATATTCTTTGCCTGGAGCTCCACAG 
TACCTGTCTTTGCTGCCCTCTGGTTCCTTAATGGCCTGGCCCAGGGGCTGGGCTGGCCCCCATGTGGGAA 
GGTCCTGCGGAAGTGGTTTGAGCCATCTCAGTTTGGCACTTGGTGGGCCATCCTGTCAACCAGCATGAAC 
CTGGCTGGAGGGCTGGGCCCTATCCTGGCAACCATCCTTGCCCAGAGCTACAGCTGGCGCAGCACGCTGG 

CCCTATCTGGGGCACTGTGTGTGGTTGTCTCCTTCCTCTGTCTCCTGCTCATCCACAATGAACCTGCTGA
TGTTGGACTCCGCAACCTGGACCCCATGCCCTCTGAGGGCAAGAAGGGCTCCTTGAAGGAGGAGAGCACC 
CTGCAGGAGCTGCTGCTGTCCCCTTACCTGTGGGTGCTCTCCACTGGTTACCTTGTGGTGTTTGGAGTAA 
AGACCTGCTGTACTGACTGGGGCCAGTTCTTCCTTATCCAGGAGAAAGGACAGTCAGCCCTTGTAGGTAG 
CTCCTACATGAGTGCCCTGGAAGTTGGGGGCCTTGTAGGCAGCATCGCAGCTGGCTACCTGTCAGACCGG 

GCCATGGCAAAGGCGGGACTGTCCAACTACGGGAACCCTCGCCATGGCCTGTTGCTGTTCATGATGGCTG
GCATGACAGTGTCCATGTACCTCTTCCGGGTAACAGTGACCAGTGACTCCCCCAAGCTCTGGATCCTGGT 
ATTGGGAGCTGTATTTGGTTTCTCCTCGTATGGCCCCATTGCCCTGTTTGGAGTCATAGCCAACGAGAGT 
GCCCCTCCCAACTTGTGTGGCACCTCCCACGCCATTGTGGGACTCATGGCCAATGTGGGCGGCTTTCTGG
CTGGGCTGCCCTTCAGCACCATTGCCAAGCACTACAGTTGGAGCACAGCCTTCTGGGTGGCTGAAGTGAT 

TTGTGCGGCCAGCACGGCTGCCTTCTTCCTCCTACGAAACATCCGCACCAAGATGGGCCGAGTGTCCAAG
AAGGCTGAGTGAAGAGAGTCCAGGTTCCGGAGCACCATCCCACGGTGGCCTTCCCCCTGCACGCTCTGCG

GGGAGAAAAGGAGGGGCCTGCCTGGCTAGCCCTGAACCTTTCACTTTCCATTTCTGCGCCTTTTCTGTCA 

CCCGGGTGGCGCTGGAAGTTATCAGTGGCTAGTGAGGTCCCAGCTCCCTGATCCTATGCTCTATTTAAAA 

GATAACCTTTGGCCTTAGACTCCGTTAGCTCCTATTTCCTGCCTTCAGACAAACAGGAAACTTCTGCAGT 

CAGGAAGGCTCCTGTACCCTTCTTCTTTTCCTAGGCCCTGTCCTGCCCGCATCCTACCCCATCCCCACCT 

GAAGTGAGGCTATCCCTGCAGCTGCAGGGCACTAATGACCCTTGACTTCTGCTGGGTCCTAAGTCCTCTC 

AGCAGTGGGTGACTGCTGTTGCCAATACCTCAGACTCCAGGGAAAGAGAGGAGGCCATCATTCTCACTGT 

ACCACTAGGCGCAGTTGGATATAGGTGGGAAGAAAAGGTGACTTGTTATAGAAGATTAAAACTAGATTTG 
ATACTGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 6685) 



gi|4503130|ref|NM_001904.1| Homo sapiens catenin (cadherin-associated
protein), beta 1, 88kDa (CTNNB1), mRNA

AAGCCTCTCGGTCTGTGGCAGCAGCGTTGGCCCGGCCCCGGGAGCGGAGAGCGAGGGGAGGCGGAGACGG 
AGGAAGGTCTGAGGAGCAGCTTCAGTCCCCGCCGAGCCGCCACCGCAGGTCGAGGACGGTCGGACTCCCG 
CGGCGGGAGGAGCCTGTTCCCCTGAGGGTATTTGAAGTATACCATACAACTGTTTTGAAAATCCAGCGTG 
GACAATGGCTACTCAAGCTGATTTGATGGAGTTGGACATGGCCATGGAACCAGACAGAAAAGCGGCTGTT 
AGTCACTGGCAGCAACAGTCTTACCTGGACTCTGGAATCCATTCTGGTGCCACTACCACAGCTCCTTCTC 
TGAGTGGTAAAGGCAATCCTGAGGAAGAGGATGTGGATACCTCCCAAGTCCTGTATGAGTGGGAACAGGG 
ATTTTCTCAGTCCTTCACTCAAGAACAAGTAGCTGATATTGATGGACAGTATGCAATGACTCGAGCTCAG 
AGGGTACGAGCTGCTATGTTCCCTGAGACATTAGATGAGGGCATGCAGATCCCATCTACACAGTTTGATG 
CTGCTCATCCCACTAATGTCCAGCGTTTGGCTGAACCATCACAGATGCTGAAACATGCAGTTGTAAACTT 
GATTAACTATCAAGATGATGCAGAACTTGCCACACGTGCAATCCCTGAACTGACAAAACTGCTAAATGAC 
GAGGACCAGGTGGTGGTTAATAAGGCTGCAGTTATGGTCCATCAGCTTTCTAAAAAGGAAGCTTCCAGAC 
ACGCTATCATGCGTTCTCCTCAGATGGTGTCTGCTATTGTACGTACCATGCAGAATACAAATGATGTAGA 
AACAGCTCGTTGTACCGCTGGGACCTTGCATAACCTTTCCCATCATCGTGAGGGCTTACTGGCCATCTTT 
AAGTCTGGAGGCATTCCTGCCCTGGTGAAAATGCTTGGTTCACCAGTGGATTCTGTGTTGTTTTATGCCA 
TTACAACTCTCCACAACCTTTTATTACATCAAGAAGGAGCTAAAATGGCAGTGCGTTTAGCTGGTGGGCT 
GCAGAAAATGGTTGCCTTGCTCAACAAAACAAATGTTAAATTCTTGGCTATTACGACAGACTGCCTTCAA 
ATTTTAGCTTATGGCAACCAAGAAAGCAAGCTCATCATACTGGCTAGTGGTGGACCCCAAGCTTTAGTAA 
ATATAATGAGGACCTATACTTACGAAAAACTACTGTGGACCACAAGCAGAGTGCTGAAGGTGCTATCTGT 
CTGCTCTAGTAATAAGCCGGCTATTGTAGAAGCTGGTGGAATGCAAGCTTTAGGACTTCACCTGACAGAT 
CCAAGTCAACGTCTTGTTCAGAACTGTCTTTGGACTCTCAGGAATCTTTCAGATGCTGCAACTAAACAGG 
AAGGGATGGAAGGTCTCCTTGGGACTCTTGTTCAGCTTCTGGGTTCAGATGATATAAATGTGGTCACCTG 
TGCAGCTGGAATTCTTTCTAACCTCACTTGCAATAATTATAAGAACAAGATGATGGTCTGCCAAGTGGGT 
GGTATAGAGGCTCTTGTGCGTACTGTCCTTCGGGCTGGTGACAGGGAAGACATCACTGAGCCTGCCATCT 
GTGCTCTTCGTCATCTGACCAGCCGACACCAAGAAGCAGAGATGGCCCAGAATGCAGTTCGCCTTCACTA 
TGGACTACCAGTTGTGGTTAAGCTCTTACACCCACCATCCCACTGGCCTCTGATAAAGGCTACTGTTGGA 
TTGATTCGAAATCTTGCCCTTTGTCCCGCAAATCATGCACCTTTGCGTGAGCAGGGTGCCATTCCACGAC
TAGTTCAGTTGCTTGTTCGTGCACATCAGGATACCCAGCGCCGTACGTCCATGGGTGGGACACAGCAGCA 
ATTTGTGGAGGGGGTCCGCATGGAAGAAATAGTTGAAGGTTGTACCGGAGCCCTTCACATCCTAGCTCGG 
GATGTTCACAACCGAATTGTTATCAGAGGACTAAATACCATTCCATTGTTTGTGCAGCTGCTTTATTCTC 
CCATTGAAAACATCCAAAGAGTAGCTGCAGGGGTCCTCTGTGAACTTGCTCAGGACAAGGAAGCTGCAGA 
AGCTATTGAAGCTGAGGGAGCCACAGCTCCTCTGACAGAGTTACTTCACTCTAGGAATGAAGGTGTGGCG 
ACATATGCAGCTGCTGTTTTGTTCCGAATGTCTGAGGACAAGCCACAAGATTACAAGAAACGGCTTTCAG 
TTGAGCTGACCAGCTCTCTCTTCAGAACAGAGCCAATGGCTTGGAATGAGACTGCTGATCTTGGACTTGA 
TATTGGTGCCCAGGGAGAACCCCTTGGATATCGCCAGGATGATCCTAGCTATCGTTCTTTTCACTCTGGT 
GGATATGGCCAGGATGCCTTGGGTATGGACCCCATGATGGAACATGAGATGGGTGGCCACCACCCTGGTG 
CTGACTATCCAGTTGATGGGCTGCCAGATCTGGGGCATGCCCAGGACCTCATGGATGGGCTGCCTCCAGG 
TGACAGCAATCAGCTGGCCTGGTTTGATACTGACCTGTAAATCATCCTTTAGCTGTATTGTCTGAACTTG 
CATTGTGATTGGCCTGTAGAGTTGCTGAGAGGGCTCGAGGGGTGGGCTGGTATCTCAGAAAGTGCCTGAC 
ACACTAACCAAGCTGAGTTTCCTATGGGAACAATTGAAGTAAACTTTTTGTTCTGGTCCTTTTTGGTCGA 
GGAGTAACAATACAAATGGATTTTGGGAGTGACTCAAGAAGTGAAGAATGCACAAGAATGGATCACAAGA 
TGGAATTTAGCAAACCCTAGCCTTGCTTGTTAAAATTTTTTTTTTTTTTTTTTTAAGAATATCTGTAATG 
GTACTGACTTTGCTTGCTTTGAAGTAGCTCTTTTTTTTTTTTTTTTTTTTTTTTTTTGCAGTAACTGTTT 
TTTAAGTCTCTCGTAGTGTTAAGTTATAGTGAATACTGCTACAGCAATTTCTAATTTTTAAGAATTGAGT

AATGGTGTAGAACACTAATTAATTCATAATCACTCTAATTAATTGTAATCTGAATAAAGTGTAACAATTG 

TGTAGCCTTTTTGTATAAAATAGACAAATAGAAAATGGTCCAATTAGTTTCCTTTTTAATATGCTTAAAA 

TAAGCAGGTGGATCTATTTCATGTTTTTGATCAAAAACTATTTGGGATATGTATGGGTAGGGTAAATCAG 

TAAGAGGTGTTATTTGGAACCTTGTTTTGGACAGTTTACCAGTTGCCTTTTATCCCAAAGTTGTTGTAAC

CTGCTGTGATACGATGCTTCAAGAGAAAATGCGGTTATAAAAAATGGTTCAGAATTAAACTTTTAATTCA 
TT (SEQ ID NO: 6686) 



gi|18104977|ref|NM_002827.2| Homo sapiens protein tyrosine phosphatase,
non-receptor type 1 (PTPN1), mRNA

GTGATGCGTAGTTCCGGCTGCCGGTTGACATGAAGAAGCAGCAGCGGCTAGGGCGGCGGTAGCTGCAGGG 
GTCGGGGATTGCAGCGGGCCTCGGGGCTAAGAGCGCGACGCGGCCTAGAGCGGCAGACGGCGCAGTGGGC 
CGAGAAGGAGGCGCAGCAGCCGCCCTGGCCCGTCATGGAGATGGAAAAGGAGTTCGAGCAGATCGACAAG 

TCCGGGAGCTGGGCGGCCATTTACCAGGATATCCGACATGAAGCCAGTGACTTCCCATGTAGAGTGGCCA
AGCTTCCTAAGAACAAAAACCGAAATAGGTACAGAGACGTCAGTCCCTTTGACCATAGTCGGATTAAACT 
ACATCAAGAAGATAATGACTATATCAACGCTAGTTTGATAAAAATGGAAGAAGCCCAAAGGAGTTACATT 
CTTACCCAGGGCCCTTTGCCTAACACATGCGGTCACTTTTGGGAGATGGTGTGGGAGCAGAAAAGCAGGG 
GTGTCGTCATGCTCAACAGAGTGATGGAGAAAGGTTCGTTAAAATGCGCACAATACTGGCCACAAAAAGA 

AGAAAAAGAGATGATCTTTGAAGACACAAATTTGAAATTAACATTGATCTCTGAAGATATCAAGTCATAT
TATACAGTGCGACAGCTAGAATTGGAAAACCTTACAACCCAAGAAACTCGAGAGATCTTACATTTCCACT 
ATACCACATGGCCTGACTTTGGAGTCCCTGAATCACCAGCCTCATTCTTGAACTTTCTTTTCAAAGTCCG 
AGAGTCAGGGTCACTCAGCCCGGAGCACGGGCCCGTTGTGGTGCACTGCAGTGCAGGCATCGGCAGGTCT 
GGAACCTTCTGTCTGGCTGATACCTGCCTCTTGCTGATGGACAAGAGGAAAGACCCTTCTTCCGTTGATA 

TCAAGAAAGTGCTGTTAGAAATGAGGAAGTTTCGGATGGGGCTGATCCAGACAGCCGACCAGCTGCGCTT
CTCCTACCTGGCTGTGATCGAAGGTGCCAAATTCATCATGGGGGACTCTTCCGTGCAGGATCAGTGGAAG 
GAGCTTTCCCACGAGGACCTGGAGCCCCCACCCGAGCATATCCCCCCACCTCCCCGGCCACCCAAACGAA 
TCCTGGAGCCACACAATGGGAAATGCAGGGAGTTCTTCCCAAATCACCAGTGGGTGAAGGAAGAGACCCA 
GGAGGATAAAGACTGCCCCATCAAGGAAGAAAAAGGAAGCCCCTTAAATGCCGCACCCTACGGCATCGAA 

AGCATGAGTCAAGACACTGAAGTTAGAAGTCGGGTCGTGGGGGGAAGTCTTCGAGGTGCCCAGGCTGCCT
CCCCAGCCAAAGGGGAGCCGTCACTGCCCGAGAAGGACGAGGACCATGCACTGAGTTACTGGAAGCCCTT 
CCTGGTCAACATGTGCGTGGCTACGGTCCTCACGGCCGGCGCTTACCTCTGCTACAGGTTCCTGTTCAAC 
AGCAACACATAGCCTGACCCTCCTCCACTCCACCTCCACCCACTGTCCGCCTCTGCCCGCAGAGCCCACG 
CCCGACTAGCAGGCATGCCGCGGTAGGTAAGGGCCGCCGGACCGCGTAGAGAGCCGGGCCCCGGACGGAC 

GTTGGTTCTGCACTAAAACCCATCTTCCCCGGATGTGTGTCTCACCCCTCATCCTTTTACTTTTTGCCCC
TTCCACTTTGAGTACCAAATCCACAAGCCATTTTTTGAGGAGAGTGAAAGAGAGTACCATGCTGGCGGCG 
CAGAGGGAAGGGGCCTACACCCGTCTTGGGGCTCGCCCCACCCAGGGCTCCCTCCTGGAGCATCCCAGGC 
GGGCGGCACGCCAACAGCCCCCCCCTTGAATCTGCAGGGAGCAACTCTCCACTCCATATTTATTTAAACA 
ATTTTTTCCCCAAAGGCATCCATAGTGCACTAGCATTTTCTTGAACCAATAATGTATTAAAATTTTTTGA 

TGTCAGCCTTGCATCAAGGGCTTTATCAAAAAGTACAATAATAAATCCTCAGGTAGTACTGGGAATGGAA
GGCTTTGCCATGGGCCTGCTGCGTCAGACCAGTACTGGGAAGGAGGACGGTTGTAAGCAGTTGTTATTTA 
GTGATATTGTGGGTAACGTGAGAAGATAGAACAATGCTATAATATATAATGAACACGTGGGTATTTAATA 
AGAAACATGATGTGAGATTACTTTGTCCCGCTTATTCTCCTCCCTGTTATCTGCTAGATCTAGTTCTCAA 
TCACTGCTCCCCCGTGTGTATTAGAATGCATGTAAGGTCTTCTTGTGTCCTGATGAAAAATATGTGCTTG 

AAATGAGAAACTTTGATCTCTGCTTACTAATGTGCCCCATGTCCAAGTCCAACCTGCCTGTGCATGACCT
GATCATTACATGGCTGTGGTTCCTAAGCCTGTTGCTGAAGTCATTGTCGCTCAGCAATAGGGTGCAGTTT 
TCCAGGAATAGGCATTTGCCTAATTCCTGGCATGACACTCTAGTGACTTCCTGGTGAGGCCCAGCCTGTC 
CTGGTACAGCAGGGTCTTGCTGTAACTCAGACATTCCAAGGGTATGGGAAGCCATATTCACACCTCACGC 
TCTGGACATGATTTAGGGAAGCAGGGACACCCCCCGCCCCCCACCTTTGGGATCAGCCTCCGCCATTCCA 

AGTCAACACTCTTCTTGAGCAGACCGTGATTTGGAAGAGAGGCACCTGCTGGAAACCACACTTCTTGAAA
CAGCCTGGGTGACGGTCCTTTAGGCAGCCTGCCGCCGTCTCTGTCCCGGTTCACCTTGCCGAGAGAGGCG 
CGTCTGCCCCACCCTCAAACCCTGTGGGGCCTGATGGTGCTCACGACTCTTCCTGCAAAGGGAACTGAAG 
ACCTCCACATTAAGTGGCTTTTTAACATGAAAAACACGGCAGCTGTAGCTCCCGAGCTACTCTCTTGCCA 
GCATTTTCACATTTTGCCTTTCTCGTGGTAGAAGCCAGTACAGAGAAATTGTGTGGTGGGAACATTCGAG 

GTGTCACCCTGCAGAGCTATGGTGAGGTGTGGATAAGGCTTAGGTGCCAGGCTGTAAGCATTCTGAGCTG
GGCTTGTTGTTTTTAAGTCCTGTATATGTATGTAGTAGTTTGGGTGTGTATATATAGTAGCATTTCAAAA

TGGACGTACTGGTTTAACCTCCTATCCTTGGAGAGCAGCTGGCTCTCCACCTTGTTACACATTATGTTAG 

AGAGGTAGCGAGCTGCTCTGCTATATGCCTTAAGCCAATATTTACTCATCAGGTCATTATTTTTTACAAT 
GGCCATGGAATAAACCATTTTTACAAAA (SEQ ID NO: 6687) 



gi|12831192|gb|AF333324.1| Hepatitis C virus type 1b polyprotein mRNA,
complete cds

GCCAGCCCCCGATTGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCA 

GAAAGCGTCTAGCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCA
TAGTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCCG 
CTCAATGCCTGGAGATTTGGGCGTGCCCCCGCGAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCC 
TTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCATCATGAGCACA 
AATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCACAGGACGTTAAGTTCCCGGGCG 

GTGGTCAGATCGTTGGTGGAGTTTACCTGTTGCCGCGCAGGGGCCCCAGGTTGGGTGTGCGCGCGACTAG
GAAGACTTCCGAGCGGTCGCAACCTCGTGGAAGGCGACAACCTATCCCCAAGGCTCGCCGGCCCGAGGGT 
AGGACCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAACGAGGGTATGGGGTGGGCAGGATGGC 
TCCTGTCACCCCGTGGCTCTCGGCCTAGTTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGTAATTTGGG 
TAAGGTCATCGATACCCTTACATGCGGCTTCGCCGACCTCATGGGGTACATTCCGCTTGTCGGCGCCCCC 

CTAGGAGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAGGACGGCGTGAACTATGCAACAG
GGAATCTGCCCGGTTGCTCTTTCTCTATCTTCCTCTTAGCTTTGCTGTCTTGTTTGACCATCCCAGCTTC 
CGCTTACGAGGTGCGCAACGTGTCCGGGATATACCATGTCACGAACGACTGCTCCAACTCAAGTATTGTG 
TATGAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTCCGGGAGAGTAATTTCTCCC 
GTTGCTGGGTAGCGCTCACTCCCACGCTCGCGGCCAGGAACAGCAGCATCCCCACCACGACAATACGACG 

CCACGTCGATTTGCTCGTTGGGGCGGCTGCTCTCTGTTCCGCTATGTACGTTGGGGATCTCTGCGGATCC
GTTTTTCTCGTCTCCCAGCTGTTCACCTTCTCACCTCGCCGGTATGAGACGGTACAAGATTGCAATTGCT 
CAATCTATCCCGGCCACGTATCAGGTCACCGCATGGCTTGGGATATGATGATGAACTGGTCACCTACAAC 
GGCCCTAGTGGTATCGCAGCTACTCCGGATCCCACAAGCCGTCGTGGACATGGTGGCGGGGGCCCACTGG 
GGTGTCCTAGCGGGCCTTGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTCTTGATTGTGATGCTAC

TCTTTGCTGGCGTTGACGGGCACACCCACGTGACAGGGGGAAGGGTAGCCTCCAGCACCCAGAGCCTCGT
GTCCTGGCTCTCACAAGGGCCATCTCAGAAAATCCAACTCGTGAACACCAACGGCAGCTGGCACATCAAC 
AGGACCGCTCTGAATTGCAATGACTCCCTCCAAACTGGGTTCATTGCTGCGCTGTTCTACGCACACAGGT 
TCAACGCGTCCGGATGTCCAGAGCGCATGGCCAGCTGCCGCCCCATCGACAAGTTCGCTCAGGGGTGGGG 
TCCCATCACTCACGTTGTGCCTAACATCTCGGACCAGAGGCCTTATTGCTGGCACTATGCACCCCAACCG 

TGCGGTATTGTACCCGCGTCGCAGGTGTGTGGCCCAGTGTATTGCTTCACCCCGAGTCCTGTTGTGGTGG
GGACGACCGACCGTTCCGGAGTCCCCACGTATAGCTGGGGGGAGAATGAGACAGACGTGCTGCTACTCAA 
CAACACGCGGCCGCCGCAAGGCAACTGGTTCGGCTGTACATGGATGAATAGCACCGGGTTCACCAAGACG 
TGCGGGGGCCCCCCGTGTAACATCGGGGGGGTTGGCAACAACACCTTGATTTGCCCCACGGATTGCTTCC 
GAAAGCACCCCGAGGCCACTTACACCAAATGCGGCTCGGGTCCTTGGTTGACACCTAGGTGTCTAGTTGA 

CTACCCATACAGACTTTGGCACTACCCCTGCACTATCAATTTTACCATCTTCAAGGTCAGGATGTACGTG
GGGGGCGTGGAGCACAGGCTCAACGCCGCGTGCAATTGGACCCGAGGAGAGCGCTGTGACCTGGAGGACA 
GGGATAGATCAGAGCTTAGCCCGCTGCTATTGTCTACAACGGAGTGGCAGGTACTGCCCTGTTCCTTTAC 
CACCCTACCGGCTCTGTCCACTGGATTGATCCACCTCCATCAGAATATCGTGGACGTGCAATACCTGTAC 
GGTGTAGGGTCAGTGGTTGTCTCCGTCGTAATCAAATGGGAGTATGTTCTGCTGCTCTTCCTTCTCCTGG 

CGGACGCGCGCGTCTGTGCCTGCTTGTGGATGATGCTGCTGATAGCCCAGGCTGAGGCCACCTTAGAGAA
CCTGGTGGTCCTCAATGCGGCGTCTGTGGCCGGAGCGCATGGCCTTCTCTCCTTCCTCGTGTTCTTCTGC 
GCCGCCTGGTACATCAAAGGCAGGCTGGTCCCTGGGGCGGCATATGCTCTCTATGGCGTATGGCCGTTGC 
TCCTGCTCTTGCTGGCTTTACCACCACGAGCTTATGCCATGGACCGAGAGATGGCTGCATCGTGCGGAGG 
CGCGGTTTTTGTAGGTCTGGTACTCTTGACCTTGTCACCATACTATAAGGTGTTCCTCGCTAGGCTCATA 

TGGTGGTTACAATATTTTATCACCAGGGCCGAGGCGCACTTGCAAGTGTGGGTCCCCCCTCTTAATGTTC
GGGGAGGCCGCGATGCCATCATCCTCCTTACATGCGCGGTCCATCCAGAGCTAATCTTTGACATCACCAA 
ACTCCTGCTCGCCATACTCGGTCCGCTCATGGTGCTCCAAGCTGGCATAACCAGAGTGCCGTACTTCGTG 
CGCGCTCAAGGGCTCATTCATGCATGCATGTTAGTGCGGAAGGTCGCTGGGGGTCATTATGTCCAAATGG 
CCTTCATGAAGCTGGGCGCGCTGACAGGCACGTACATTTACAACCATCTTACCCCGCTACGGGATTGGGC 

CCACGCGGGCCTACGAGACCTTGCGGTGGCAGTGGAGCCCGTCGTCTTCTCCGACATGGAGACCAAGATC
ATCACCTGGGGAGCAGACACCGCGGCGTGTGGGGACATCATCTTGGGTCTGCCCGTCTCCGCCCGAAGGG
GAAAGGAGATACTCCTGGGCCCGGCCGATAGTCTTGAAGGGCGGGGGTGGCGACTCCTCGCGCCCATCAC 
GGCCTACTCCCAACAGACGCGGGGCCTACTTGGTTGCATCATCACTAGCCTTACAGGCCGGGACAAGAAC 
CAGGTCGAGGGAGAGGTTCAGGTGGTTTCCACCGCAACACAATCCTTCCTGGCGACCTGCGTCAACGGCG 
TGTGTTGGACCGTTTACCATGGTGCTGGCTCAAAGACCTTAGCCGGCCCAAAGGGGCCAATCACCCAGAT
GTACACTAATGTGGACCAGGACCTCGTCGGCTGGCAGGCGCCCCCCGGGGCGCGTTCCTTGACACCATGC 
ACCTGTGGCAGCTCAGACCTTTACTTGGTCACGAGACATGCTGACGTCATTCCGGTGCGCCGGCGGGGCG 
ACAGTAGGGGGAGCCTGCTCTCCCCCAGGCCTGTCTCCTACTTGAAGGGCTCTTCGGGTGGTCCACTGCT 
CTGCCCTTCGGGGCACGCTGTGGGCATCTTCCGGGCTGCCGTATGCACCCGGGGGGTTGCGAAGGCGGTG 
GACTTTGTGCCCGTAGAGTCCATGGAAACTACTATGCGGTCTCCGGTCTTCACGGACAACTCATCCCCCC
CGGCCGTACCGCAGTCATTTCAAGTGGCCCACCTACACGCTCCCACTGGCAGCGGCAAGAGTACTAAAGT 
GCCGGCTGCATATGCAGCCCAAGGGTACAAGGTGCTCGTCCTCAATCCGTCCGTTGCCGCTACCTTAGGG 
TTTGGGGCGTATATGTCTAAGGCACACGGTATTGACCCCAACATCAGAACTGGGGTAAGGACCATTACCA 
CAGGCGCCCCCGTCACATACTCTACCTATGGCAAGTTTCTTGCCGATGGTGGTTGCTCTGGGGGCGCTTA 
TGACATCATAATATGTGATGAGTGCCATTCAACTGACTCGACTACAATCTTGGGCATCGGCACAGTCCTG
GACCAAGCGGAGACGGCTGGAGCGCGGCTTGTCGTGCTCGCCACCGCTACGCCTCCGGGATCGGTCACCG 
TGCCACACCCAAACATCGAGGAGGTGGCCCTGTCTAATACTGGAGAGATCCCCTTCTATGGCAAAGCCAT 
CCCCATTGAAGCCATCAGGGGGGGAAGGCATCTCATTTTCTGTCATTCCAAGAAGAAGTGCGACGAGCTC 
GCCGCAAAGCTGTCAGGCCTCGGAATCAACGCTGTGGCGTATTACCGGGGGCTCGATGTGTCCGTCATAC
CAACTATCGGAGACGTCGTTGTCGTGGCAACAGACGCTCTGATGACGGGCTATACGGGCGACTTTGACTC
AGTGATCGACTGTAACACATGTGTCACCCAGACAGTCGACTTCAGCTTGGATCCCACCTTCACCATTGAG 
ACGACGACCGTGCCTCAAGACGCAGTGTCGCGCTCGCAGCGGCGGGGTAGGACTGGCAGGGGTAGGAGAG 
GCATCTACAGGTTTGTGACTCCGGGAGAACGGCCCTCGGGCATGTTCGATTCCTCGGTCCTGTGTGAGTG 
CTATGACGCGGGCTGTGCTTGGTACGAGCTCACCCCCGCCGAGACCTCGGTTAGGTTGCGGGCCTACCTG 
AACACACCAGGGTTGCCCGTTTGCCAGGACCACCTGGAGTTCTGGGAGAGTGTCTTCACAGGCCTCACCC
ACATAGATGCACACTTCTTGTCCCAGACCAAGCAGGCAGGAGACAACTTCCCCTACCTGGTAGCATACCA 
AGCCACGGTGTGCGCCAGGGCTCAGGCCCCACCTCCATCATGGGATCAAATGTGGAAGTGTCTCATACGG 
CTGAAACCTACGCTGCACGGGCCAACACCCTTGCTGTACAGGCTGGGAGCCGTCCAAAATGAGGTCACCC 
TCACCCACCCCATAACCAAATACATCATGGCATGCATGTCGGCTGACCTGGAGGTCGTCACTAGCACCTG 
GGTGCTGGTGGGCGGAGTCCTTGCAGCTCTGGCCGCGTATTGCCTGACAACAGGCAGTGTGGTCATTGTG
GGTAGGATTATCTTGTCCGGGAGGCCGGCTATTGTTCCCGACAGGGAGCTTCTCTACCAGGAGTTCGATG 
AAATGGAAGAGTGCGCCACGCACCTCCCTTACATTGAGCAGGGAATGCAGCTCGCCGAGCAGTTCAAGCA 
GAAAGCGCTCGGGTTACTGCAAACAGCCACCAAACAAGCGGAGGCTGCTGCTCCCGTGGTGGAGTCCAAG
TGGCGAGCCCTTGAGACATTCTGGGCGAAGCACATGTGGAATTTCATCAGCGGGATACAGTACTTAGCAG 
GCTTATCCACTCTGCCTGGGAACCCCGCAATAGCATCATTGATGGCATTCACAGCCTCTATCACCAGCCC
GCTCACCACCCAAAGTACCCTCCTGTTTAACATCTTGGGGGGGTGGGTGGCTGCCCAACTCGCCCCCCCC 
AGCGCCGCTTCGGCTTTCGTGGGCGCCGGCATCGCCGGTGCGGCTGTTGGCAGCATAGGCCTTGGGAAGG
TGCTTGTGGACATTCTGGCGGGTTATGGAGCAGGAGTGGCCGGCGCGCTCGTGGCCTTTAAGGTCATGAG 
CGGCGAGATGCCCTCTACCGAGGACCTGGTCAATCTACTTCCTGCCATCCTCTCTCCTGGCGCCCTGGTC 
GTCGGGGTCGTGTGTGCAGCAATACTGCGTCGGCACGTGGGTCCGGGAGAGGGGGCTGTGCAGTGGATGA
ACCGGCTGATAGCGTTCGCCTCGCGGGGTAATCACGTTTCCCCCACGCACTATGTGCCTGAGAGCGACGC 
CGCAGCGCGTGTTACTCAGATCCTCTCCAGCCTTACCATCACTCAGCTGCTGAAAAGGCTCCACCAGTGG 
ATTAATGAGGACTGCTCCACACCGTGTTCCGGCTCGTGGCTAAGGGATGTTTGGGACTGGATATGCACGG 
TGTTGACTGACTTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCAGCTACCGGGAGTCCCTTTTTTCTC 
GTGCCAACGCGGGTACAAGGGAGTCTGGCGGGGAGACGGCATCATGCAAACCACCTGCCCATGTGGAGCA
CAGATCACCGGACATGTCAAAAACGGTTCCATGAGGATCGTCGGGCCTAAGACCTGCAGCAACACGTGGC
ATGGAACATTCCCCATCAACGCATACACCACGGGCCCCTGCACACCCTCTCCAGCGCCAAACTATTCTAG 
GGCGCTGTGGCGGGTGGCCGCTGAGGAGTACGTGGAGGTCACGCGGGTGGGGGATTTCCACTACGTGACG 
GGCATGACCACTGACAACGTAAAGTGCCCATGCCAGGTTCCGGCTCCTGAATTCTTCTCGGAGGTGGACG 
GAGTGCGGTTGCACAGGTACGCTCCGGCGTGCAGGCCTCTCCTACGGGAGGAGGTTACATTCCAGGTCGG
GCTCAACCAATACCTGGTTGGGTCACAGCTACCATGCGAGCCCGAACCGGATGTAGCAGTGCTCACTTCC 
ATGCTCACCGACCCCTCCCACATCACAGCAGAAACGGCTAAGCGTAGGTTGGCCAGGGGGTCTCCCCCCT
CCTTGGCCAGCTCTTCAGCTAGCCAGTTGTCTGCGCCTTCCTTGAAGGCGACATGCACTACCCACCATGT 
CTCTCCGGACGCTGACCTCATCGAGGCCAACCTCCTGTGGCGGCAGGAGATGGGCGGGAACATCACCCGC 
GTGGAGTCGGAGAACAAGGTGGTAGTCCTGGACTCTTTCGACCCGCTTCGAGCGGAGGAGGATGAGAGGG
AAGTATCCGTTCCGGCGGAGATCCTGCGGAAATCCAAGAAGTTCCCCGCAGCGATGCCCATCTGGGCGCG 
CCCGGATTACAACCCTCCACTGTTAGAGTCCTGGAAGGACCCGGACTACGTCCCTCCGGTGGTGCACGGG 
TGCCCGTTGCCACCTATCAAGGCCCCTCCAATACCACCTCCACGGAGAAAGAGGACGGTTGTCCTAACAG
AGTCCTCCGTGTCTTCTGCCTTAGCGGAGCTCGCTACTAAGACCTTCGGCAGCTCCGAATCATCGGCCGT 
CGACAGCGGCACGGCGACCGCCCTTCCTGACCAGGCCTCCGACGACGGTGACAAAGGATCCGACGTTGAG 
TCGTACTCCTCCATGCCCCCCCTTGAGGGGGAACCGGGGGACCCCGATCTCAGTGACGGGTCTTGGTCTA 
CCGTGAGCGAGGAAGCTAGTGAGGATGTCGTCTGCTGCTCAATGTCCTACACATGGACAGGCGCCTTGAT
CACGCCATGCGCTGCGGAGGAAAGCAAGCTGCCCATCAACGCGTTGAGCAACTCTTTGCTGCGCCACCAT 
AACATGGTTTATGCCACAACATCTCGCAGCGCAGGCCTGCGGCAGAAGAAGGTCACCTTTGACAGACTGC 
AAGTCCTGGACGACCACTACCGGGACGTGCTCAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTAA 
ACTCCTATCCGTAGAGGAAGCCTGCAAGCTGACGCCCCCACATTCGGCCAAATCCAAGTTTGGCTATGGG 

GCAAAGGACGTCCGGAACCTATCCAGCAAGGCCGTTAACCACATCCACTCCGTGTGGAAGGACTTGCTGG
AAGACACTGTGACACCAATTGACACCACCATCATGGCAAAAAATGAGGTTTTCTGTGTCCAACCAGAGAA 
AGGAGGCCGTAAGCCAGCCCGCCTTATCGTATTCCCAGATCTGGGAGTCCGTGTATGCGAGAAGATGGCC 
CTCTATGATGTGGTCTCCACCCTTCCTCAGGTCGTGATGGGCTCCTCATACGGATTCCAGTACTCTCCTG 
GGCAGCGAGTCGAGTTCCTGGTGAATACCTGGAAATCAAAGAAAAACCCCATGGGCTTTTCATATGACAC 

TCGCTGTTTCGACTCAACGGTCACCGAGAACGACATCCGTGTTGAGGAGTCAATTTACCAATGTTGTGAC
TTGGCCCCCGAAGCCAGACAGGCCATAAAATCGCTCACAGAGCGGCTTTATATCGGGGGTCCTCTGACTA 
ATTCAAAAGGGCAGAACTGCGGTTATCGCCGGTGCCGCGCGAGCGGCGTGCTGACGACTAGCTGCGGTAA 
CACCCTCACATGTTACTTGAAGGCCTCTGCAGCCTGTCGAGCTGCGAAGCTCCAGGACTGCACGATGCTC 
GTGAACGGAGACGACCTTGTCGTTATCTGTGAAAGCGCGGGAACCCAAGAGGACGCGGCGAGCCTACGAG 

TCTTCACGGAGGCTATGACTAGGTACTCTGCCCCCCCCGGGGACCCGCCCCAACCAGAATACGACTTGGA
GCTGATAACATCATGTTCCTCCAATGTGTCGGTCGCCCACGATGCATCAGGCAAAAGGGTGTACTACCTC 
ACCCGTGATCCCACCACCCCCCTCGCACGGGCTGCGTGGGAAACAGCTAGACACACTCCAGTTAACTCCT 
GGCTAGGCAACATTATCATGTATGCGCCCACTTTGTGGGCAAGGATGATTCTGATGACTCACTTCTTCTC 
CATCCTTCTAGCACAGGAGCAACTTGAAAAAGCCCTGGACTGCCAGATCTACGGGGCCTGTTACTCCATT 

GAGCCACTTGACCTACCTCAGATCATTGAACGACTCCATGGCCTTAGCGCATTTTCACTCCATAGTTACT
CTCCAGGTGAGATCAATAGGGTGGCTTCATGCCTCAGGAAACTTGGGGTACCACCCTTGCGAGTCTGGAG 
ACATCGGGCCAGGAGCGTCCGCGCTAGGCTACTGTCCCAGGGGGGGAGGGCCGCCACTTGTGGCAAGTAC 
CTCTTCAACTGGGCAGTGAAGACCAAACTCAAACTCACTCCAATCCCGGCTGCGTCCCAGCTGGACTTGT 
CCGGCTGGTTCGTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGTGCCCGACCCCGCTG 

GTTCATGCTGTGCCTACTCCTACTTTCTGTAGGGGTAGGCATCTACCTGCTCCCCAACCGATGAACGGGG
AGCTAAACACTCCAGGCCAATAGGCCATTTCCTGTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCT 
TTTCCTTCTTTTTCCCTTTTTCTTTCTTCCTTCTTTAATGGTGGCTCCATCTTAGCCCTAGTCACGGCTA 
GCTGTGAAAGGTCCGTGAGCCGCATGACTGCAGAGAGTGCTGATACTGGCCTCTCTGCAGATCATGT 
(SEQ ID NO: 6688) 


gi|306286|gb|M96362.1|HPCUNKCDS Hepatitis C virus mRNA, complete cds
TGCCAGCCCCCGATTGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGC 
AGAAAGCGTCTAGCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCC 
ATAGTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCC 

GCTCAATGCCTGGAGATTTGGGCGTGCCCCCGCGAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGC
CTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCACCATGAGCAC 
GAATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCACAGGATATTAAGTTCCCGGGC 
GGTGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCCAGGTTGGGTGTGCGCGCGACTA 
GGAAGACTTCCGAGCGGTCGCAACCTCGTGGAAGGCGACAGCCTATCCCCAAGGCTCGCCGGCCCGAGGG 

CAGGGCCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGCTTGGGGTGGGCAGGATGG
CTCCTGTCACCCCGCGGCTCCCGGCCTAGTTGGGGCCCCACGGACCCCCGGCGTAAGTCGCGTAATTTGG 
GTAAGGTCATCGACACCCTCACATGCGGCTTCGCCGACCTCATGGGGTACATTCCGCTCGTCGGCGCCCC 
CCTAGGGGGCGTTGCCAGGGCCCTGGCACATGGTGTCCGGGTGCTGGAGGACGGCGTGAACTATGCAACA 
GGGAATCTGCCCGGTTGCTCTTTCTCTATCTTCCTCTTGGCTCTGCTGTCTTGTTTGACCACCCCAGTTT 

CCGCTTATGAAGTGCGTAACGCGTCCGGGATGTACCATGTCACGAACGACTGCTCCAACTCAAGCATTGT
GTATGAGGCAGCGGACATGATCATGCACACTCCCGGGTGCGTGCCCTGCGTTCGGGAGGACAACTCCTCC 
CGTTGCTGGGTGGCACTTACTCCCACGCTCGCGGCCAGGAATGCCAGCGTCCCCACTACGACATTGCGAC 
GCCATGTCGACTTGCTCGTTGGGGTAGCTGCTTTCTGTTCCGCTATGTACGTGGGGGACCTCTGCGGATC 
TGTTTTCCTTGTTTCCCAGCTGTTCACCTTTTCGCCTCGCCGGCATGAGACGGTACAGGACTGCAACTGC 

TCAATCTATCCCGGCCGCGTATCAGGTCACCGCATGGCCTGGGATATGATGATGAACTGGTCGCCTACAA
CAGCCCTAGTGGTATCGCAGCTACTCCGGATCCCACAAGCTGTCGTGGACATGGTGACAGGGTCCCACTG 
GGGAATCCTGGCGGGCCTTGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTCTTAATTGCGATGCTA
CTCTTTGCCGGCGTTGACGGAACCACCCACGTGACAGGGGGGGCGCAAGGTCGGGCCGCTAGCTCGCTAA 
CGTCCCTCTTTAGCCCTGGGCCGGTTCAGCACCTCCAGCTCATAAACACCAACGGCAGCTGGCATATCAA 
CAGGACCGCCCTGAGCTGCAATGACTCCCTCAACACTGGGTTTGTTGCCGCGCTGTTCTACAAATACAGG 
TTCAACGCGTCCGGGTGCCCGGAGCGCTTGGCCACGTGCCGCCCCATTGATACATTCGCGCAGGGGTGGG
GTCCCATCACTTACACTGAGCCTCATGATTTGGATCAGAGGCCCTATTGCTGGCACTACGCGCCTCAACC 
GTGTGGTATTGTGCCCACGTTGCAGGTGTGTGGCCCAGTATACTGCTTCACCCCGAGTCCTGTTGCGGTG 
GGGACTACCGATCGTTTCGGTGCCCCTACATACAGATGGGGGGCAAATGAGACGGACGTGCTGCTCCTTA 
ACAACGCCGGGCCGCCGCAAGGCAACTGGTTCGGCTGTACATGGATGAATGGCACTGGGTTCACCAAGAC 

ATGTGGGGGCCCCCCGTGTAACATCGGGGGGGTCGGCAACAATACCTTGACCTGCCCCACGGACTGCTTC
CGAAAGCACCCCGGGGCCACTTACACCAAATGCGGTTCGGGGCCTTGGTTAACACCCAGGTGCTTAGTCG 
ACTACCCGTACAGGCTCTGGCATTACCCCTGCACTGTCAACTTTACCATCTTTAAGGTTAGGATGTACGT 
GGGGGGCGCGGAGCACAGGCTCGACGCCGCATGCAACTGGACTCGGGGAGAGCGTTGTGACCTGGAGGAC 
AGGGATAGGTCAGAGCTTAGCCCGCTGCTGCTGTCTACAACAGAGTGGCAGGTACTGCCCTGTTCCTTCA 

CAACCCTACCGGCTCTGTCCACTGGTTTGATTCATCTCCATCAGAACATCGTGGACATACAATACCTGTA
CGGTATAGGGTCGGCGGTTGTCTCCTTTGCGATCAAATGGGAGTATATTGTGCTGCTCTTCCTTCTTCTG 
GCGGACGCGCGCGTCTGCGCTTGCTTGTGGATGATGCTGCTGGTAGCGCAAGCCGAGGCCGCCTTAGAGA 
ACCTGGTGGTCCTCAATGCAGCGTCCGTGGCCGGAGCGCATGGCATTCTTTCCTTCATTGTGTTCTTCTG 
TGCTGCCTGGTACATCAAGGGCAGGCTGGTTCCCGGAGCGGCATACGCCCTCTATGGCGTATGGCCGCTG 

CTTCTGCTTCTGCTGGCGTTACCACCACGGGCGTACGCCATGGACCGGGAGATGGCCGCATCGTGCGGAG
GCGCGGTTTTTGTAGGTCTGGTACTCTTGACCTTGTCACCACACTATAAAGTGTTCCTTGCCAGGTTCAT 
ATGGTGGCTACAATATCTCATCACCAGAACCGAAGCGCATCTGCAAGTGTGGGTCCCCCCTCTCAACGTT 
CGGGGGGGTCGCGATGCCATCATCCTCCTCACATGCGTGGTCCACCCAGAGCTAATCTTTGACATCACAA 
AATATTTGCTCGCCATATTCGGCCCGCTCATGGTGCTCCAGGCCGGCATAACTAGAGTGCCGTACTTCGT 

GCGCGCACAAGGGCTCATTCGTGCATGCATGTTGGCGCGGAAAGTCGTGGGGGGTCATTACGTCCAAATG
GTCTTCATGAAGCTGGCCGCACTAGCAGGTACGTACGTTTATGACCATCTTACTCCACTGCGAGATTGGG 
CTCACACGGGCTTACGAGACCTTGCAGTGGCAGTAGAGCCCGTTGTCTTCTCTGACATGGAGACCAAAGT 
CATCACCTGGGGGGCAGACACCGCGGCGTGCGGGGACATCATCTTGGCCCTGCCTGCTTCCGCCCGAAGG 
GGGAAGGAGATACTTCTGGGACCGGCCGATAGTCTTGAAGGACAGGGGTGGCGACTCCTTGCGCCCATCA 

CGGCCTACTCCCAACAAACGCGAGGCCTGCTTGGTTGCATCATCACTAGCCTTACAGGCCGGGACAAGAA
CCAGGTTGAGGGGGAGGTTCAAGTGGTTTCCACCGCAACACAATCTTTCCTGGCGACCTGCATCAATGGC 
GTGTGTTGGACTGTCTTCCACGGCGCCGGCTCAAAGACCCTAGCCGGCCCAAAGGGTCCAATCACCCAAA 
TGTACACCAATGTAGACCAGGACCTTGTTGGCTGGCCGGCACCTCCTGGGGCGCGTTCCCTGACACCATG 
CACTTGCGGCTCCTCGGACCTTTACCTGGTCACGAGACATGCTGATGTCATTCCGGTGCGCCGGCGGGGT 

GACGGTAGGGGGAGCCTACTCCCCCCCAGGCCTGTCTCCTACTTGAAGGGCTCCTCGGGTGGTCCACTGC
TCTGCCCTTCGGGGCACGCTGTCGGCATACTTCCGGCTGCTGTATGCACCCGGGGGGTTGCCATGGCGGT 
GGAATTCATACCCGTTGAGTCTATGGAAACTACTATGCGGTCTCCGGTCTTCACGGACAATCCGTCTCCC 
CCGGCTGTACCGCAGACATTCCAAGTGGCCCACTTACACGCTCCCACCGGCAGCGGCAAGAGCACTAGGG 
TGCCGGCTGCATATGCAGCCCAAGGGTACAAGGTGCTCGTCCTAAATCCGTCCGTCGCCGCCACCTTGGG 

TTTTGGGGCGTATATGTCCAAGGCACATGGTATCGACCCCAACCTTAGAACTGGGGTAAGGACCATCACC
ACAGGTGCCCCTATCACATACTCCACCTATGGCAAGTTCCTTGCCGACGGTGGCGGCTCCGGGGGCGCCT 
ATGACATCATAATGTGTGATGAGTGCCACTCAACTGACTCGACTACCATTTATGGCATCGGCACAGTCCT 
GGACCAAGCGGAGACGGCTGGAGCGCGGCTCGTGGTGCTCTCCACCGCTACGCCTCCGGGATCGGTCACC 
GTGCCACACCTCAATATCGAGGAGGTGGCCCTGTCTAATACTGGAGAGATCCCCTTCTACGGCAAAGCCA

TTCCCATCGAGGCTATCAAGGGGGGAAGGCATCTCATTTTCTGCCATTCCAAGAAGAAGTGTGACGAACT
CGCCGCAAAGCTGTCAGGCCTCGGACTCAATGCCGTAGCGTATTACCGGGGTCTTGACGTGTCCGTCATA 
CCGACCAGCGGAGACGTTGTTGTCGTGGCGACGGACGCTCTAATGACGGGCTTTACCGGCGACTTTGACT 
CAGTGATCGACTGTAATACGTGTGTCACCCAGACAGTCGATTTCAGCTTGGACCCCACCTTCACCATTGA 
GACGACGACCGTGCCCCAAGACGCAGTGTCGCGCTCGCAGAGGCGAGGCAGGACTGGTAGGGGCAGGGCT 

GGCATATACAGGTTTGTGACTCCAGGAGAACGGCCCTCGGGCATGTTCGATTCTTCGGTCCTGTGTGAGT
GTTATGACGCGGGTTGTGCGTGGTACGAACTCACGCCCGCTGAGACCTCGGTTAGGTTGCGGGCGTACCT 
AAACACACCAGGGTTGCCCGTCTGCCAGGACCATCTGGAGTTCTCGGAGGGTGTCTTCACAGGCCTCACC 
CACATAGATGCCCACTTCTTATCCCAGACTAAACAGGCAGGAGAGAACTTCCCCTACTTGGTAGCATACC 
AGGCTACAGTGTGCGCCAGGGCTCAAGCCCCACCTCCATCGTGGGATGAAATGTGGAGGTGTCTCATACG 

GCTGAAACCTACGCTGCACGGGCCAACACCCCTGCTGTATAGGTTAGGAGCCGTCCAAAATGAGGTCACC
CTCACACACCCCATAACCAAATTCATCATGACATGTATGTCGGCTGACCTGGAGGTCGTCACCAGCACCT 
GGGTGCTGGTAGGCGGAGTCCTCGCAGCTCTGGCCGCGTACTGCCTGACAACAGGCAGCGTGGTCATTGT 
GGGCAGGATCATCCTGTCCGGGAAGCCGGCTATCATCCCCGATAGGGAAGTTCTCTACCAGGAGTTCGAC
GAGATGGAGGAGTGTGCCTCACACCTCCCTTACTTCGAACAGGGAATGCAGCTCGCCGAGCAATTCAAAC 
AGAAGGCGCTCGGGTTGCTGCAAACAGCCACCAAGCAGGCGGAGGCTGCTGCTCCCGTGGTGGAGTCCAA 
GTGGCGAGCCCTTGAGACCTTCTGGGCGAAGCACATGTGGAACTTCATTAGTGGGATACAGTACTTGGCA 
GGCTTGTCCACTCTGCCTGGGAACCCCGCAATACGATCACCGATGGCATTCACAGCCTCCATCACCAGCC
CGCTCACCACCCAGCATACCCTCTTGTTTAACATCTTGGGGGGATGGGTGGCTGCCCAACTCGCCCCCCC 
CAGCGCTGCCTCAGCTTTCGTGGGCGCCGGCATCGCTGGAGCCGCTGTTGGCACGATAGGCCTTGGGAAG 
GTGCTTGTGGACATTCTGGCAGGTTATGGAGCAGGGGTGGCGGGCGCACTTGTGGCCTTTAAGATCATGA 
GCGGCGAGATGCCTTCAGCCGAGGACATGGTCAACTTACTCCCTGCCATCCTTTCTCCCGGTGCCCTGGT 

CGTCGGGATTGTGTGTGCAGCAATACTGCGTCGGCATGTGGGCCCAGGGGAAGGGGCTGTGCAGTGGATG
AACCGGCTGATAGCGTTCGCCTCGCGGGGTAACCACGTCTCCCCCAGGCACTATGTGCCAGAGAGCGAGC 
CTGCAGCGCGTGTTACCCAGATCCTTTCCAGCCTCACCATCACTCAGCTGTTGAAGAGACTCCACCAGTG 
GATTAATGAGGACTGCTCTACGCCATGCTCCAGCTCGTGGCTAAGGGAGATTTGGGACTGGATCTGCACG 
GTGTTGACTGACTTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCGATTACCGGGAGTCCCTTTTTTCT 

CATGCCAACGCGGGTATAAGGGAGTCTGGCGGGGGGACGGCATCATGCACACCACCTGCCCATGCGGAGC
ACAGATCACCGGACACGTCAAAAACGGTTCCATGAGGATCGTTGGGCCTAAAACCTGCAGCAACACGTGG 
TACGGGACATTCCCCATCAACGCGTACACCACGGGCCCCTGCACACCCTCCCCGGCGCCAAACTATTCCA 
AGGCATTGTGGAGAGTGGCCGCTGAGGAGTACGTGGAGGTCACGCGGGTGGGAGATTTTCACTACGTGAC 
GGGCATGACCACTGACAACGTGAAGTGTCCATGCCAGGTTCCGGCCCCCGAATTCTTCACGGAGGTGGAT 

GGAGTGCGGTTGCACAGGTACGCTCCGGCGTGCAGACCTCTCCTACGGGAGGAGGTCGTATTCCAGGTCG
GGCTCCACCAGTACCTGGTCGGGTCACAGCTCCCATGCGAGCCCGAACCGGATGTAGCAGTGCTCACTTC 
CATGCTCACTGACCCCTCCCACATTACAGCAGAGACGGCTAAGCGTAGGCTGGCCAGGGGGTCTCCCCCC 
TCCTTGGCCAGCTCTTCAGCTAGCCAGTTGTCTGCGCCTTCCTTGAAGGCGACATGCACTACCCATCATG 
ACTCCCCGGACGCTGACCTCATTGAGGCCAACCTCTTGTGGCGGCAAGAGATGGGCGGGAACATCACCCG 

CGTGGAGTCAGAGAATAAGGTGGTAATCCTGGACTCTTTCGACCCGCTCCGAGCGGAGGATGATGAGGGG
GAAATATCCGTTCCGGCGGAGATCCTGCGGAAATCCAGGAAATTCCCCCCAGCGCTGCCCATATGGGCGC 
CGCCGGATTACAACCCTCCGCTGCTAGAGTCCTGGAAGGACCCGGACTACGTTCCTCCGGTGGTACACGG 
GTGCCCGTTGCCGCCCACCAAGGCCCCTCCAATACCACCTCCACGGAGGAAGAGGACGGTTGTCCTGACA 
GAATCCACCGTGTCTTCTGCCTTGGCGGAGCTCGCTACTAAGACCTTCGGCAGCTCCGGATCGTCGGCCA 

TCGACAGCGGTACGGCGACGGCCCCTCCTGACCAAGCCTCCGGTGACGGCGACAGAGAGTCCGACGTTGA
GTCGTTCTCCTCCATGCCCCCCCTTGAGGGAGAGCCGGGGGACCCCGATCTCAGCGACGGATCTTGGTCC 
ACCGTGAGCGAGGAGGCTAGTGAGGACGTCGTCTGCTGTTCGATGTCCTACACATGGACAGGCGCCCTGA 
TCACGCCATGCGCTGCGGAGGAAAGCAAGTTGCCCATCAACCCGTTGAGCAATTCTTTGCTACGTCACCA 
CAACATGGTCTATGCTACAACATCCCGCAGCGCAGGCCTGCGGCAGAAGAAGGTCACCTTTGACAGACTG 

CAAGTCCTGGACGACCACTACCGGGACGTGCTTAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTA
AACTTCTATCTGTAGAAGAAGCCTGCAAACTGACGCCCCCACATTCGGCCAAATCCAAATTTGGCTACGG 
GGCGAAGGACGTCCGGAGCCTATCCAGCAGGGCCGTTACCCACATCCGCTCCGTGTGGAAGGACCTGCTG 
GAAGACACTGAAACACCAATTAGCACTACCATCATGGCAAAAAATGAGGTTTTCTGTGTCCAACCAGAGA 
AGGGAGGCCGCAAGCCAGCTCGCCTTATCGTGTTCCCAGATCTGGGAGTTCGTGTATGCGAGAAGATGGC 

CCTTTATGACGTGGTCTCCACCCTTCCTCAGGCCGTGATGGGCTCCTCATACGGATTCCAGTACTCTCCT
AAGCAGCGGGTCGAGTTCCTGGTGAATACCTGGAAATCAAAGAAATGCCCCATGGGCTTCTCATATGACA 
CCCGCTGTTTTGACTCAACGGTCACTGAGAATGACATCCGTGTTGAGGAGTCAATTTACCAATGTTGTGA 
CTTGGCCCCCGAAGCCAAACTGGCCATAAAGTCGCTCACAGAGCGGCTCTATATCGGGGGTCCCCTGACT 
AATTCAAAAGGGCAGAACTGCGGTTACCGCCGGTGCCGCGCGAGCGGCGTGCTGACGACTAGCTGCGGTA 

ATACCCTCACATGTTACCTGAAAGCCACTGCGGCCTGTCGAGCTGCGAAGCTCCGGGACTGCACGATGCT
CGTGAACGGAGACGACCTTGTCGTTATCTGTGAAAGCGCGGGAACCCAAGAGGATGCGGCGAGCCTACGA 
GTCTTCACGGAGGCTATGACTAGGTACTCTGCCCCCCCTGGGGACCCGCCTCAACCGGAATACGACTTGG 
AGTTGATAACATCATGTTCCTCCAATGTGTCGGTCGCACACGATGCATCTGGTAAAAGGGTGTACTACCT 
CACCCGTGACCCTACCACCCCCCTTGCACGGGCTGCGTGGGAGACAGCTAGACACACTCCAGTCAACTCC 

TGGCTAGGCAACATCATCATGTATGCGCGCACCTTATGGGCAAGGATGATTCTGATGACTCATTTCTTCT
CCATCCTTCTAGCTCAGGAGCAACTTGAAAAAACCCTAGATTGTCAGATCTACGGGGCCTGTTACTCCAT 
TGAACCACTTGATCTACCTCAGATCATTGAGCGACTCCATGGTCTTAGCGCATTTTCACTCCATAGTTAC 
TCTCCAGGCGAGATCAATAGGGTGGCTTCATGCCTCAGAAAACTTGGGGTACCACCCTTGCGAGCCTGGA 
GACATCGGGCCAGAAGTGTCCGCGCTAAGCTACTGTCCCAGGGGGGGAGGGCCGCCACTTGTGGCAAGTA 

CCTCTTCAACTGGGCGGTGAGGACCAAGCTCAAACTCACTCCAATCCCAGCCGCGTCCCGGTTGGACTTG
TCCGGCTGGTTCGTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGTGCCCGACCCCGCT 
GGTTCATGTTGTGCCTACTCCTACTTTCCGTGGGGGTAGGCATCTACCTGCTCCCCAACCGATGAATGGG 
GAGCTAAACACTCCAGGCCAATAGGCCGTTTCTC (SEQ ID NO: 6689)



gi|329739|gb|L02836.1|HPCCGENOM Hepatitis C China virus complete genome
ATTGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCAGAAAGCGTCTA
GCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATAGTGGTCTGC 
GGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCCGCTCAATGCCTG 
GAGATTTGGGCGTGCCCCCGCGAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCCTTGTGGTACTG 
CCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCACCATGAGCACGAATCCTAAACC 

TCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCACAGGACGTCAAGTTCCCGGGCGGTGGTCAGATC
GTTGGTGGAGTTTACCTGTTGCCGCGCAGGGGCCCCAGGTTGGGTGTGCGCGCGACTAGGAAGACTTCCG 
AGCGGTCGCAACCTCGTGGAAGGCGACAACCTATCCCCAAGGCTCGCCGACCCGAGGGCAGGACCTGGGC 
TCAGCCCGGGTATCCTTGGCCCCTCTATGGCAATGAGGGCTTTGGGTGGGCAGGATGGCTCCTGTCACCC 
CGCGGCTCCCGGCCTAGTTGGGGCCCCACGGACCCCCGGCGTAGGTCGCGTAATTTGGGTAAGGTCATCG 

ATACCCTCACATGCGGCTTCGCCGACCTCATGGGGTACATTCCGCTCGTCGGCGCCCCCTTGGGGGGCGC
TGCCAGGGCCCTGGCACATGGTGTCCGGGTTCTGGAGGACGGCGTGAACTATGCAACAGGGAATTTGCCC 
GGTTGCTCTTTCTCTATCTTCCTTTTAGCCTTGCTATCCTGTTTGACCACCCCAGCTTCCGCTTACGAAG 
TGCGTAACGTGTCCGGGATATACCATGTCACGAACGACTGCTCCAACTCAAGCATTGTGTATGAGGCAGC 
GGACCTGATCATGCATACCCCTGGGTGCGTGCCCTGCGTTCGGGAAGGCAACTCCTCCCGTTGCTGGGTA 

GCGCTCACTCCCACGCTCGCGGCCAGGAACGCCACGATCCCCACTGCGACAGTACGACGGCATGTCGATC
TGCTCGTTGGGGCGGCTGCTTTCTCTTCCGCCATGTACGTGGGGGATCTCTGCGGATCTGTTTTCCTTGT 
CTCTCAGCTGTTCACCTTCTCGCCTCGCCGGTATGAGACAATACAGGACTGCAATTGCTCAATCTATCCC 
GGCCACGTAACAGGTCACCGCATGGCTTGGGATATGATGATGAACTGGTCGCCTACAACAGCTCTAGTGG 
TGTCGCAGTTACTCCGGATCCCTCAAGCCGTCATGGACATGGTGGTGGGGGCCCACTGGGGAGTCCTGGC 

GGGCCTTGCCTACTATGCCATGGTGGGGAATTGGGCTAAGGTTTTGATTGTGATGCTACTCTTCGCCGGC
GTTGATGGGGATACCTACGCGTCTGGGGGGGCGCAGGGCCGCTCCACCCTCGGGTTCACGTCCCTCTTTA 
CACCTGGGGCCTCTCAGAAGATCCAGCTTATAAATACCAATGGTAGCTGGCATATCAACAGGACTGCCCT 
GAACTGCAATGACTCCCTCAATACTGGGTTTCTTGCCGCGCTGTTCTATACACACAGGTTCAACGCGTCC
GGATGCGCAGAGCGCATGGCCAGCTGCCGCCCCATTGATACATTCGATCAGGGCTGGGGCCCCATCACTT 

ATACTGAGCCTGATAGCTCGGACCAGAGGCCTTATTGCTGGCACTACGCGCCTCGAAAGTGCGGCATCGT
ACCTGCGTCGGAGGTGTGCGGTCCAGTGTATTGTTTCACCCCAAGCCCTGTCGTCGTGGGGACGACCGAT 
CGTTTCGGTGTCCCCACATATAGCTGGGGGGAGAATGAGACAGACGTGCTGCTCCTCAACAACACGCGGC 
CGCCGCAAGGCAACTGGTTTGGCTGTACATGGATGAATGGCACTGGGTTCACCAAGACGTGCGGGGGGCC 
TCCGTGTAACATCGGGGGGGTCGGCAACAACACTTTGACTTGCCCCACGGATTGCTTTCGGAAGCACCCC 

GAGGCTACGTATACAAGGTGTGGTTCGGGGCCTTGGCTGACACCTAGGTGCTTAGTTGACTACCCATACA
GGCTCTGGCACTACCCCTGCACTGTCAACTTTGCCATCTTCAAAGTTAGGATGTATGTGGGGGGCGTGGA 
GCACAGGCTCGATGCTGCATGCAACTGGACTCGAGGAGAGCGCTGTAACTTGGAGGACAGGGATAGATCA 
GAACTCAGCCCGCTGCTACTGTCTACAACAGAGTGGCAGATACTACCCTGCGCCTTCACCACCCTACCGG 
CTCTGTCCACTGGTTTAATCCATCTCCATCAGAACATCGTGGACGTGCAATACCTGTACGGTATAGGGTC 

AGCGGTTGCCTCCTTTGCAATTAAATGGGAGTATGTCTTGTTGCTTTTCCTTCTACTAGCAGACGCGCGC
GTATGTGCCTGCTTGTGGATGATGCTGCTGATAGCCCAGGCCGAGGCCGCCTTAGAGAACCTGGTGGTCC 
TCAATGCGGCGTCCGTGGCCGACGCGCATGGCATCCTCTCCTTCCTTGTGTTCTTTTGTGCCGCCTGGTA 
CATTAAGGGCAGGCTGGTCCCCGGGGCAGCATACGCTTTCTACGGCGTGTGGCCGCTGCTCCTGCTCCTG 
CTGACATTACCACCACGAGCTTACGCCATGGACCGGGAGATGGCTGCATCGTGCGGAGGCGCGGTTTTTG 

TAGGTCTGGTATTCCTGACTTTGTCACCATACTACAAGGTGTTCCTCGCTAGGCTCATATGGTGGTTGCA
ATACTTCCTCACCATAGCCGAGGCGCACCTGCAAGTGTGGATCCCCCCTCTCAACATTCGAGGGGGCCGC 
GATGCCATCATCCTCCTCACGTGTGCi^TCCACCCAGAGTCAATCTTTGACATCACCAAACTCCTGCTCG 
CCACGCTCGGTCCGCTCCTGGTGCTTCAGGCTGGCATAACTAGAGTGCCGTACTTTGTGCGCGCTCATGG 
GCTCATTCGCGCGTGCATGCTATTGCGGAAAGTTGCTGGGGGTCATTATGTCCAAATGGCCTTCATGAAG 

CTGGGCGCACTGACAGGTACGTACGTCTATAACCATCTTACTCCGCTGCAGTATTGGCCACGCGCGGGTT
TACGAGAACTCGCGGTGGCAGTAGAGCCCGTCATCTTCTCTGACATGGAGACCAAGATTATCACCTGGGG 
GGCAGACACTGCAGCGTGTGGAGACATCATCTTGGGTTTACCCGTCTCCGCCCGAAGGGGAAAGGAGATA 
CTCCTGGGGCCGGCCGATAGTCTTGAAGGGCAGGGGTGGCGACTCCTTGCGCCCATCACGGCCTACTCCC 
AACAGACGCGGGGCTTACTTGGTTGCATCATCACTAGCCTCACAGGCCGAGACAAGAACCAGGTCGAGGG 

GGAGGTTCAAGTGGTCTCCACCGCAACACAATCTTTCCTGGCGACCTGCATCAACGGTGTGTGTTGGACT
GTCTATCATGGCGCCGGCTCAAAAACCTTAGCCGGCCCAAAGGGCCCAATCACCCAAATGTACACCAATG
TAGACCAGGACCTCGTCGGCTGGCACCGGCCCCCCGGGGCGCGTTCCCTAACACCATGCACCTGCGGCAG 
CTCGGACCTTTACTTGGTCACGAGACATGCTGATGTCATTCCGGTGCGCCGTCGAGGCGACAGTAGGGGG 
AGTTTACTCTCCCCCAGGCCTGTCTCCTACCTGAAGGGCTCGTCGGGGGGCCCACTGCTCTGCCCCTTCG 
GGCACGTTGCAGGCATCTTCCGGGCTGCTGTGTGCACCCGGGGGGTTGCGAAGGCGGTGGATTTTATACC
CGTTGAGACCATGGAAACTACCATGCGGTCCCCGGTCTTCACGGACAACTCATCCCCTCCTGCCGTACCG 
CAGACATTCCAAGTGGCCCATCTACACGCTCCCACTGGCAGCGGCAAAAGCACCAAGGTGCCGGCTGCAT 
ATGCAGCCCAAGGGTACAAGGTACTTGTCTTGAACCCGTCTGTTGCCGCCACTTTAGGTTTTGGGGCGTA 
TATGTCTAAGGCACATGGTGTCGACCCCAACATTAGAACCGGGGTAAGGACCATCACCACGGGCGCCCCC 

ATCACATACTCTACCTATGGCAAGTTCCTTGCTGATGGTGGTTGCTCTGGGGGTGCCTATGACATTATAA
TATGTGATGAGTGCCATTCAACTGACTCGACTACCATCTTGGGCATCGGCACGGTCCTGGACCAAGCGGA 
GACGGCTGGAGCGCGGCTTGTCGTGCTCGCCACCGCTACGCCTCCGGGATCGGTCACCGTGCCACATCCA 
AACATCGAGGAGGTGGCCCTGTCCAATACTGGAGAGATCCCCTTCTATGGTAAAGCCATCCCCATCGAAG 
CCATCAGGGGGGGAAGGCATCTCATTTTCTGCCACTCCAAGAAGAAGTGTGACGAGCTTGCTGCAAAGCT 

ATCATCGCTCGGGCTCAACGCTGTGGCGTACTACCGGGGGCTTGATGTGTCCGTCATACCATCTAGCGGA
GACGTCGTTGTCGTGGCAACGGACGCTCTAATGACGGGCTTTACGGGCGACTTTGACTCAGTGATCGACT 
GTAACACATGTGTTACCCAAACAGTCGATTTCAGCTTGGACCCCACCTTCACCATCGAGACAACGACCGT 
GCCCCAAGACGCGGTGTCGCGCTCGCAGCGGCGAGGTAGGACTGGCAGGGGTAGGGAAGGCATCTACAGG 
TTTGTTACTCCAGGAGAACGGCCCTCGGGCATGTTCGACTCCTCAGTCCTGTGTGAGTGCTATGACGCGG 

GCTGTGCTTGGTACGAGCTCACGCCGGCTGAGACCACGGTTAGGTTGCGGGCTTACCTAAATACACCAGG
GTTGCCCGTCTGCCAGGACCATCTGGAGTTCTGGGAGGGCGTCTTCACAGGTCTCACCCATATAGACGCT 
CACTTTCTGTCCCAGACCAAGCAAGCAGGAGACAACTTCCCCTACCTGGTAGCATACCAAGCTACAGTGT 
GTGCCAAGGCTCAGGCCCCACCTCCATCGTGGGATCAAATGTGGAAGTGCCTCACACGGCTAAAGCCTAC 
GCTGCAGGGACCAACACCCCTGCTGTATAGGCTAGGAGCCGTCCAAAATGAGGTCACCCTCACACACCCC 

ATAACTAAATACATCATGACATGCATGTCGGCTGACCTGGAGGTCGTCACCAGCACCTGGGTGCTGGTGG
GCGGAGTCCTTGCAGCTCTGGCCGCGTATTGCCTGACAACGGGCAGCGTGGTCATTGTGGGTAGGATTGT 
CTTGTCCGGAAGTCCGGCTATTGTTCCTGACAGGGAAGTTCTTTACCAAGACTTCGACGAGATGGAAGAG 
TGTGCCTCACACCTCCCTTACATCGAACAGGGAATGCAGCTCGCCGAGCAGTTCAAGCAGAAGGCGCTCG 
GGTTGCTGCAAACAGCCACCAAGCAAGCGGAGGCTGCTGCTCCCGTGGTGGAGTCCAAGTGGCGAGCCCT 

CGAGACATTTTGGGAAAAACACATGTGGAATTTCATCAGCGGGATACAGTACTTAGCAGGCTTATCCACT
CTGCCTGGGAACCCCGCAATGGCATCACTGATGGCATTCACAGCTTCTATCACCAGCCCGCTCACTACCC 
AACACACCCTCCTGTTTAACATCTTGGGTGGATGGGTGGCTGCCCAACTCGCTCCCCCCAGCGCCGCTTC 
GGCCTTTGTGGGCGCCGGCATTGCCGGTGCGGCTGTTGGCAGCATAGGCCTTGGGAAGGTGCTTGTGGAC 
ATCCTGGCGGGTTATGGGGCGGGGGTGGCTGGCGCACTCGTGGCCTTTAAGGTCATGAGTGGCGAAATGC 

CCTCCACTGAGGACCTGGTTAATTTACTCCCTGCCATCCTCTCTCCTGGTGCCCTAGTCGTCGGGGTCGT
GTGCGCAGCAATACTGCGCCGACACGTGGGCCCGGGAGAGGGGGCTGTGCAGTGGATGAACCGGCTGATA 
GCGTTCGCTTCGCGGGGTAACCATGTCTCCCCCACGCACTATGTGCCTGAAAGTGACGCCGCAGCGCGTG 
TTACCCAGATCCTCTCCAGCCTTACCATCACTCAGCTGCTGAAAAGACTTCACCAGTGGATTAATGAGGA 
CTGTTCCACACCATGCTCCGGCTCGTGGCTAAGGGATGTTTGGGATTGGATATGCACGGTGTTGACCGAT 

TTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCGGTTGCCCGGAGTCCCTTTCCTCTCATGCCAACGCG
GGTACAAGGGAGTCTGGCGGGGGGACGGTATTATGCAAACCACCTGTCCATGTGGAGCACAGATTACTGG 
ACATGTCAAAAACGGTTCCATGAGAATCGTTGGGCCTAAGACTTGTAGCAACACGTGGCATGGAACATTC 
CCCATCAACGCGTACACCACGGGCCCCTGCACACCCTCCCCGGCGCCGAACTATTCCAGGGCGCTGTGGC 
GGGTGGCTCCTGAGGAGTACGTGGAGGTTACGCGGGTGGGGGATTTCCACTACGTGACGGGCATGACCAC 

CGACAACGTGAAATGCCCATGCCAAGTCCCGGCCCCTGAATTCTTCACGGAGGTGGATGGAGTACGGCTG
CACAGGTACGCTCCGGCGTGCAAACCTCTCCTACGGGAGGAGGTCGTGTTCCAGGTCGGGCTCAACCAAT 
ACCTGGTTGGATCACAGCTCCCATGCGAGCCCGAGCCGGACGTAACAGTGCTCACTTCCATGCTTACCGA 
CCCCTCCCACATCACAGCAGAGACGGCCAAGCGTAGGCTGGCCAGGGGGTCTCCCCCCTCCTTGGCCAGC 
TCTTCAGCTAGCCAATTGTCTGCGCCTTCTTTGAAGGCGACATGTACTACCCATCATGACTCCCCGGACG 

CCGACCTCATTGAGGCCAACCTCCTGTGGCGGCAGGAGATGGGCGGAAACATCACCCGTGTGGAGTCAGA
AAATAAGGTAGTGATCCTGGACTCTTTCGACCCGCTTCGGGCGGAGGAGGACGAGAGGGAAGTATCCGTT 
GCGGCGGAGATCCTGCGGAAATCCAGGAAGTTCCCCTCAGCGCTGCCCATATGGGCACGCCCAGACTACA 
ACCCTCCACTGCTAGAGTCCTGGAAGGACCCAGATTATGTCCCTCCGGTGGTACACGGGTGCCCGTTGCC 
GCCTACCACGGCCCCTCCAGTACCACCTCCACGGAGAAAAAGGACGGTCGTCCTAACAGAGTCATCCGTG 

TCTTCTGCCTTGGCGGAGCTCGCTACTAAGACCTTCGGCAGCTCTGAATCGTCGGCCGTCGACAGCGGCA
CGGCGACTGCCCCTCCTGACGAGGCCTCCGGCGGCGGCGACAAAGGATCCGACGTTGAGTCGTACTCCTC 
CATGCCCCCCCTTGAGGGAGAGCCGGGGGACCCCGACCTCAGCGACGGGTCCTGGTCTACCGTGAGTGAG 
GAGGCCAGTGAGGACGTCGTCTGCTGCTCAATGTCCTATACATGGACAGGCGCCTTGATCACGCCATGTG
CTGCGGAGGAGAGCAAGCTGCCCATCAACCCGCTGAGCAACTCCTTGCTGCGTCACCACAACATGGTCTA 
TGCTACAACATCCCGCAGTGCAAGCCTACGGCAGAAGAAGGTCGCTTTTGACAGAATGCAAGTCCTGGAC 
GACCACTACCGGGACGTGCTCAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTAAACTCCTATCCA 
TAGAAGAGGCCTGCAAGCTGACGCCCCCACATTCAGCCAAATCCAAATTTGGCTATGGGGCAAAAGACGT
CCGGAACCTATCCAGCAAGGCCGTTAACCACATCCGCTCCGTGTGGAAGGACTTGTTGGAAGACAATGAG 
ACACCAATCAATACCACCATCATGGCAAAAAATGAGGTTTTCTGCGTCCAACCAGAGAAAGGAGGCCGTA 
AGCCAGCTCGCCTTATCGTATTCCCAGACTTGGGAGTCCGTGTGTGCGAGAAGATGGCCCTTTATGACGT 
GGTCTCCACCCTTCCTCAGCCCGTGATGGGCTCCTCATACGGATTCCAGTACTCTCCTGGGCAGCGGGTC 

GAATTCCTGCTAAATGCCTGGAAATCAAAGGAAAACCCTATGGGCTTCTCATATGACACCCGCTGTTTTG
ACTCAACGGTCACTCAGAACGACATCCGTGTTGAGGAGTCAATTTACCAATGTTGTGACTTGGCCCCCGA 
GGCCAGACGGGCCATAAAGTCGCTCACAGAGCGGCTCTATATCGGGGGTCCCCTGACTAATTCAAAAGGG 
CAGAACTGCGGTTATCGCCGGTGCCGCGCAAGTGGCGTGCTGACGACCAGCTGCGGTAATACCCTTACAT 
GTTACTTGAAGGCCTCTGCGGCCTGTCGAGCTGCGAAGCTGCAGGACTGCACGATGCTCGTGAACGGAGA 

CGACCTTGTCGTTATCTGTGAAAGCGCGGGAACTCAAGAGGATGCGGCGAGCCTACGAGTCTTCACGGAG
GCTATGACTAGGTACTCTGCCCCCCCTGGGGACCTGCCCCAACCAGAATACGACTTGGAGCTAATAACAT 
CATGCTCCTCCAATGTGTCAGTCGCCCACGATGCATCTGGCAAAAGGGTGTACTACCTCACCCGTGACCC 
CACCATCCCCCTCGCGCGGGCTGCGTGGGAGACAGCTAGACACACTCCAGTCAACTCCTGGCTAGGCAAC 
ATCATCATGTATGCGCCCACTCTATGGGCAAGGATGATTCTGATGACTCACTTCTTCTCCATCCTTCTAG 

CTCAGGAGCAACTTGAGAAAGCCCTGGATTGCCAAATCTACGGGGCCTACTACTCCATTGAGCCACTTGA
CCTACCTCAGATCATTGAACGACTCCATGGCCTTAGCGCATTTTCACTCCATAGTTACTCTCCAGGTGAG 
ATCAATAGGGTGGCGTCATGTCTCAGGAAACTTGGGGTACCACCCTTGCGAGTCTGGAGACATCGGGCCA 
GAAGCGTCCGCGCTAAGCTACTGTCCCAGGGGGGGAGGGCCGCCACTTGTGGCAAGTACCTCTTCAACTG 
GGCAGTAAAGACCAAGCTTAAACTCACTCCAATCCCGGCTGCGTCCCGGTTGGACTTGTCCGGCTGGTTC 

GTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGTGCCCGACCCCGTTGGTTCATGTTGT
GCCTACTCCTACTTTCTGTAGGGGTAGGCATCTACCTGCTCCCCAACCGATGAACGGGGAGATAAACACT 
CCAGGCCAATAGGCCATCCC (SEQ ID NO: 6690) 



gi|15422182|gb|AY051292.1| Hepatitis C virus from India polyprotein mRNA,
complete cds

GCCAGCCCCCTGATGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCA 
GAAAGCGTCTAGCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCA 
TAGTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCCG 

CTCAATGCCTGGAGATTTGGGCGTGCCCCCGCAAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCC
TTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCACCATGAGCACG 
AATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGACGCCCACAGAACGTTAAGTTCCCGGGTG
GCGGCCAGATCGTTGGCGGAGTTTGCTTGTTGCCGCGCAGGGGTCCCAGAGTGGGTGTGCGCGCGACGAG 
GAAGACTTCCGAGCGGTCACAACCTCGCGGAAGGCGTCAGCCTATTCCCAAGGCCCGCCGACCCGAGGGC 

AGGTCCTGGGCGCAGCCCGGGTACCCTTGGCCCCTCTATGGCAACGAGGGCTGTGGGTGGGCAGGATGGC
TCTTGTCCCCCCGCGGCTCCCGGCCTAGTCGGGGCCCCTCTGACCCCCGGCGCAGGTCACGCAATTTGGG 
TAAGGTCATCGATACCCTCACGTGTGGCTTCGCCGACCTCATGGGGTACATCCCGCTCGTCGGTGCTCCT 
CTAGGGGGCGCTGCTAGGGCTCTGGCACATGGTGTTAGGGTTCTAGAAGACGGCGTAAATTACGCAACAG 
GGAACCTTCCTGGTTGCTCTTTTTCTATCTTCTTGCTTGCTCTTCTCTCCTGCTTGACAGTCCCTGCTTC 

GGCCGTCGAAGTGCGCAACTCTTCGGGGATCTACCATGTCACCAATGATTGCCCCAATGCGTCTGTTGTG
TACGAGACAGATAGCTTGATCATACATCTGCCCGGGTGTGTGCCCTGCGTACGCGAGGGCAACGCTTCGA 
GGTGCTGGGTCTCCCTTAGTCCTACTGTTGCCGCTAAGGATCCGGGCGTCCCCGTCAACGAGATTCGGCG 
TCACGTCGACCTGATTGTCGGGGCCGCTGCATTCTGTTCGGCTATGTATGTAGGGGACTTATGCGGTTCC 
ATCTTCCTCGTTGGCCAGCTTTTCACCCTCTCCCCTAGGCGCCACTGGACAACACAAGACTGTAATTGCT 

CCATCTACCCAGGACATGTGACAGGCCATCGAATGGCTTGGGACATGATGATGAATTGGTCACCTACTGG
CGCTTTGGTGGTAGCGCAGCTACTCCGGATCCCACAAGCCGTCTTGGATATGATAGCCGGTGCCCACTGG 
GGTGTCCTAGCGGGCCCGGCATACTACTCCATGGTGGGGAACTGGGCTAAGGTTTTGGTTGTGCTACTGC 
TCTTCGCTGGCGTCGATGCAACCACCCAAGTCACAGGTGGCACCGCGGGCCGTAATGCATATAGATTGGC 
TAGCCTCTTCTCCACCGGCCCCAGCCAATyiLTATCCAGCTCATAAACTCCAATGGCAGCTGGCACATTAAC 

AGGACTGCCCTGAATTGCAATGACAGCCTGCACACCGGCTGGGTAGCAGCGCTGTTCTACTCCCACAAGT






TCAACTCTTCGGGGCGTCCTGAGAGGATGGCTAGTTGTCGGCCTCTTACCGCCTTCGACCAAGGGTGGGG 
GCCCATCACTTACGGGGGGAAAGCTAGTAACGACCAGCGGCCGTATTGCTGGCACTATGCCCCACGCCCG 
TGCGGTATCGTGCCGGCGAAAGAGGTTTGCGGGCCTGTATACTGTTTCACACCCAGTCCCGTGGTAGTGG 
GGACGACGGACAAGTACGGCGTTCCTACCTACACATGGGGCGAGAATGAGACGGATGTACTGCTCCTTAA 
CAACTCTAGGCCGCCAATAGGGAATTGGTTCGGGTGTACGTGGATGAATTCCACTGGTTTCACCAAGACG
TGCGGGGCTCCTGCCTGTAACGTCGGCGGGAGCGAGACCAACACCCTGTCGTGCCCCACAGATTGCTTCC 
GCAGACATCCGGACGCAACATACGCTAAGTGCGGCTCTGGCCCTTGGCTTAACCCTCGATGCATGGTGGA 
CTACCCTTACAGGCTCTGGCACTATCCCTGCACAGTCAATTACACCATATTCAAGATCAGGATGTTCGTG 
GGCGGGATTGAGCACAGGCTCACCGCCGCGTGCAACTGGACGCGGGGAGAGCGCTGCGACTTGGACGACA 

GGGATCGTGCCGAGTTGAGCCCGCTGTTGCTGTCCACCACGCAATGGCAGGTCCTCCCCTGCTCATTCAC
AACGCTGCCCGCCCTGTCAACTGGCCTAATACATCTCCACCAGAACATCGTGGACGTGCAGTACCTCTAC 
GGGTTGAGCTCGGTAGTTACATCCTGGGCCATAAGGTGGGAGTATGTCGTGCTCCTTTTCTTGCTGTTAG 
CAGATGCCCGCATTTGTGCCTGCCTTTGGATGATGCTTCTCATATCCCAGGTAGAGGCGGCGCTGGAGAA 
CCTGATAGTCCTCAACGCTGCTTCCCTGGCTGGGACACACGGCATCGTCCCTTTCTTCATCTTTTTTTGT 

GCAGCCTGGTATCTGAAAGGCAAGTGGGCCCCTGGACTCGTCTACTCCGTCTACGGAATGTGGCCGCTGC
TCCTGCTTCTCCTGGCGTTGCCCCAACGGGCGTACGCCTTGGATCAGGAGTTGGCCGCGTCGTGTGGGGC 
CGTGGTCTTCATCAGCCTAGCGGTACTTACCCTGTCGCCGTACTACAAACAGTACATGGCCCGCGGCATC 
TGGTGGCTGCAGTACATGCTGACCAGAGCGGAGGCGCTCCTGCACGTCTGGGTCCCCTCGCTCAACGCCC 
GGGGAGGGCGTGATGGTGCCATACTGCTCATGTGTGTGCTCCACCCGCACTTGCTCTTTGACATCACCAA 

AATCATGCTGGCCATTCTCGGGCCCCTGTGGATCTTGCAGGCCAGTCTGCTCAGGGTGCCGTACTTCGTG
CGCGCCCACGGTCTCATTAGGCTCTGCATGCTGGTGCGCAAAACAGCGGGCGGTCACTATGTGCAGATGG 
CTCTGTTGAAGCTGGGGGCACTTACTGGCACTTACATTTACAACCACCTTTCCCCACTCCAAGACTGGGC 
TCATGGCAGCTTGCGTGATCTAGCGGTGGCCACCGAGCCCGTCATCTTCTCCCGGATGGAGATCAAGACT 
ATCACCTGGGGGGCAGACACCGCGGCCTGTGGAGACATCATCAACGGGCTGCCTGTTTCTGCTCGGAGGG 

GGAGAGAGGTGTTGTTGGGACCAGCCGATGCCCTGACTGACAAGGGATGGAGGCTTTTAGCCCCCATCAC
AGCTTACGCCCAACAGACACGAGGTCTCTTGGGCTGTATTGTCACCAGCCTCACCGGTCGGGACAAAAAT 
CAAGTGGAGGGGGAAATCCAGATTGTGTCTACCGCAACCCAGACGTTCTTGGCCACTTGCATCAACGGAG 
CTTGCTGGACTGTTTATCATGGGGCCGGATCGAGGACCATCGCTTCGGCGTCGGGTCCTGTGGTCCGGAT 
GTACACCAATGTGGACCAGGATTTGGTGGGCTGGCCAGCGCCTCAGGGAGCGCGCTCCCTGACGCCGTGC 

ACGTGCGGTGCCTCGGATCTGTACTTGGTCACGAGGCACGCGGATGTCATCCCAGTGCGGCGTCGAGGCG
ATAACAGGGGAAGCTTGCTTTCTCCCCGGCCCATCTCATACCTAAAAGGATCCTCGGGAGGCCCTCTGCT 
CTGCCCCATGGGACATGTCGCGGGCATTTTTAGGGCCGCGGTGTGCACCCGTGGGGTTGCAAAGGCGGTC 
GACTTTGTGCCCGTTGAGTCCTTAGAGACCACCATGAGGTCCCCAGTGTTTACTGACAATTCCAGCCCTC 
CAACAGTGCCCCAGAGTTACCAGGTGGCACATCTACATGCACCCACTGGGAGTGGCAAGAGCACGAAGGT 

GCCGGCCGCTTACGCAGCTCAAGGGTACAAGGTACTTGTGCTGAACCCGTCTGTTGCTGCCACCTTAGGG
TTCGGTGCTTATATGTCAAAGGCCCATGGGATTGACCCAAACGTCAGGACCGGCGTGAGGACCATTACCA 
CAGGCTCCCCCATCACCTACTCCACCTACGGGAAATTTTTGGCTGATGGCGGATGCCCAGGAGGTGCGTA 
CGACATCATAATATGTGACGAATGTCACTCAGTGGACGCCACCTCGATTCTGGGCATAGGGACCGTCTTG 
GACCAAGCGGAGACGGCGGGGGTTAGGCTCACTGTCCTTGCCACCGCTACACCACCTGGCTTGGTCACCG 

TGCCACATTCCAACATCGAGGAAGTTGCACTGTCCGCTGACGGGGAGAAACCATTTTATGGTAAGGCCAT
CCCCCTAAACTACATCAAGGGGGGGAGGCATCTCATTTTCTGTCATTCCAAGAAGAAGTGCGACGAGCTC 
GCTGCAAAGCTGGTCGGTCTGGGCGTCAACGCGGTGGCCTTTTACCGTGGCCTCGACGTATCTGTCATTC 
CAACTACAGGAGACGTCGTTGTTGTAGCGACCGACGCCTTGATGACTGGCTTCACCGGCGATTTCGACTC 
TGTGATAGACTGCAACACCTGTGTCGTCCAGACAGTCGACTTCAGCCTAGACCCTATATTCTCTATTGAG 

ACTTCCACCGTGCCCCAGGACGCCGTGTCCCGCTCCCAACGGAGGGGTAGGACCGGTCGAGGGAAGCATG
GTATTTACAGATATGTGTCACCCGGGGAGCGGCCGTCTGGCATGTTCGACTCCGTGGTCCTCTGTGAGTG 
CTATGACGCGGGTTGTGCTTGGTACGAGCTTACACCCGCCGAGACCACAGTCAGGCTACGGGCATACCTT 
AACACCCCAGGATTGCCCGTGTGCCAGGACCACTTGGAGTTCTGGGAGAGTGTCTTCACCGGCCTCACCC 
ACATAGATGCCCACTTCCTGTCCCAGACGAAACAGAGTGGGGAGAACTTCCCCTACCTAGTCGCATACCA 

AGCCACCGTGTGCGCTAGAGCTAGAGCTCCTCCCCCGTCATGGGACCAAATGTGGAAGTGCCTGATACGG
CTCAAGCCCACCCTCACTGGGGCTACCCCATTACTATACAGACTGGGTAGTGTACAGAATGAGATCACCT 
TAACACACCCAATCACCCAATACATCATGGCTTGCATGTCGGCGGACCTGGAGGTCGTCACTAGCACGTG
GGTGTTGGTGGGCGGCGTCCTAGCCGCTTTGGCCGCTTACTGCCTGTCCACAGGCAGCGTGGTCATAGTG 
GGCAGGATAATCCTAGGTGGGAAGCCGGCAGTCATACCTGACAGGGAGGTTCTCTACCGAGAGTTTGATG 

AGATGGAGGAGTGCGCCGCCCACGTCCCCTACCTCGAGCAGGGGATGCATTTGGCTGGACAGTTCAAGCA
GAAAGCTCTCGGGTTGCTCCAGACAGCATCCAAGCAAGCGGAGACGATCACTCCCACTGTCCGCACCAAC 
TGGCAGAAACTCGAGTCCTTCTGGGCTAAGCACATGTGGAACTTCGTTAGCGGGATACAATACCTGGCGG 
GCCTGTCAACGCTGCCCGGGAACCCCGCTATAGCGTCGCTGATGTCGTTTACGGCCGCGGTGACGAGTCC
ACTAACCACCCAGCAAACCCTCTTCTTTAACATCTTAGGGGGGTGGGTGGCGGCCCAGCTTGCTTCCCCA 
GCTGCCGCTACTGCTTTTGTCGGTGCTGGTATTACTGGCGCCGTTGTTGGCAGTGTGGGCCTAGGGAAGG 
TCCTAGTGGACATTATTGCTGGCTACGGGGCTGGTGTGGCGGGGGCCCTCGTGGCTTTCAAAATCATGAG 
CGGGGAGACCCCCACCACCGAGGATCTAGTCAACCTTCTGCCTGCCATCCTATCGCCAGGAGCTCTCGTT
GTCGGCGTGGTGTGCGCAGCAATACTACGCCGGCACGTGGGCCCTGGCGAGGGCGCCGTGCAGTGGATGA 
ACCGGCTGATAGCGTTTGCTTCTCGGGGTAACCACGTCTCCCCTACACACTACGTGCCGGAGAGCGACGC 
GTCGGCTCGTGTCACACAAATTCTCACCAGCCTCACTGTTACTCAGCTTCTGAAAAGGCTCCACGTGTGG 
ATAAGCTCGGATTGCATCGCCCCGTGTGCTAGTTCTTGGCTTAAAGATGTCTGGGACTGGATATGCGAGG 

TGCTGAGCGACTTCAAGAATTGGCTGAAGGCCAAACTTGTACCACAACTGCCCGGGATCCCATTCGTATC
CTGCCAACGCGGGTACCGTGGGGTCTGGCGGGGCGAGGGCATCGTGCACACTCGTTGCCCGTGTGGGGCC 
AATATAACTGGACATGTCAAGAACGGTTCGATGAGAATCGTCGGGCCTAAGACTTGCAGCAACACCTGGC 
GTGGGTCGTTCCCCATTAACGCTTACACTACAGGCCCGTGCACGCCCTCCCCGGCGCCGAACTATACGTT 
CGCGCTATGGAGGGTGTCTGCAGAGGAGTATGTGGAGGTAAGGCGGCTGGGGGACTTCCATTACGTCACG 

GGGGTGACCACTGATAAACTCAAGTGTCCATGCCAGGTCCCCTCACCCGAGTTCTTCACAGAGGTGGACG
GGGTGCGCCTGCATAGGTACGCCCCCCCCTGCAAACCCCTGCTGCGAGAAGAGGTGACGTTTAGCATCGG 
GCTCAATGAATACTTGGTGGGGTCCCAGTTGCCCTGCGAGCCCGAGCCAGACGTAGCTGTACTGACATCA 
ATGCTTACAGACCCCTCCCACATCACTGCAGAGACGGCAGCGCGTAGGCTGAAGCGGGGGTCTCCCCCCT 
CCCTGGCCAGCTCTTCCGCCAGCCAGCTGTCCGCGCCGTCACTGAAGGCAACATGCACCACTCACCACGA 

CTCTCCAGACGCTGACCTCATAGAAGCCAACCTCCTGTGGAGACAGGAGATGGGGGGGAACATCACTAGG
GTGGAGTCGGAGAACAAGATTGTCGTTCTGGATTCTTTCGACCCGCTCGTAGCGGAGGAGGATGATCGGG 
AGATCTCTATTCCAGCTGAGATTCTGCGGAAGTTCAAGCAGTTTCCTCCCGCTATGCCCATATGGGCACG 
GCCAGATTATAATCCTCCCCTTGTGGAACCGTGGAAGCGCCCGGACTATGAGCCACCCTTAGTCCACGGG 
TGCCCCCTACCACCTCCCAAGCCAACTCCGGTGCCGCCACCCCGGAGAAAGAGGACGGTGGTGCTGGACG 

AGTCTACAGTATCATCTGCTCTGGCTGAGCTTGCCACTAAGACCTTCGGCAGCTCTACAACCTCAGGCGT
GACAAGTGGTGAAGCGACTGAATCGTCCCCGGCGCCCTCCTGCGGCGGTGAGCTGGACTCCGAAGCTGAA 
TCTTACTCCTCCATGCCCCCTCTCGAGGGGGAGCCGGGGGACCCCGATCTCAGCGACGGGTCTTGGTCTA 
CCGTGAGCAGTGATGGTGGCACGGAAGACGTTGTGTGCTGCTCGATGTCTTACTCGTGGACGGGCGCTTT 
AATCACGCCCTGTGCCTCAGAGGAAGCCAAGCTCCCTATCAACGCATTGAGCAACTCGCTGCTGCGCCAC 

CACAACTTGGTGTATTCCACCACCTCTCGCAGCGCTGGCCAGAGACAGAAAAAAGTCACATTTGACAGAG
TGCAAGTCCTGGACGACCATTACCGGGACGTGCTCAAGGAGGCTAAGGCCAAGGCATCCACGGTGAAGGC 
TAGACTGCTATCCGTTGAGGAAGCGTGTAGCCTGACGCCCCCACACTCCGCCAGATCAAAATTTGGCTAT 
GGGGCGAAGGATGTCCGAAGCCATTCCAGTAAGGCTATACGCCACATCAACTCCGTGTGGCAGGACCTTC 
TGGAGGACAATACAACACCCATAGACACTACCATCATGGCAAAGAATGAGGTCTTCTGTGTGAAGCCCGA 

AAAGGGGGGCCGCAAGCCCGCTCGTCTTATCGTGTACCCCGACCTGGGAGTGCGCGTATGCGAGAAGAGG
GCTTTGTATGACGTAGTCAAACAGCTCCCCATTGCCGTGATGGGAGCCTCCTACGGGTTCCAGTACTCAC 
CAGCGCAGCGGGTCGACTTCCTGCTTAAAGCGTGGAAATCTAAGAAAGTCCCCATGGGGTTTTCCTATGA 
CACCCGTTGCTTTGACTCAACAGTCACTGAGGCTGATATCCGTACGGAGGAAGACCTCTACCAATCTTGT 
GACCTGGCCCCTGAGGCTCGCATAGCCATAAGGTCCCTCACAGAGAGGCTTTACATCGGGGGCCCACTCA 

CCAATTCTAAGGGACAAAACTGCGGCTATCGGCGATGCCGCGCAAGCGGCGTGCTGACCACTAGCTGCGG
TAACACCATAACCTGCTTCCTCAAAGCCAGTGCAGCCTGTCGAGCTGCGAAGCTCCAGGACTGCACCATG 
CTCGTGTGCGGCGACGACCTCGTCGTTATCTGTGAGAGCGCCGGTGTCCAGGAGGACGCTGCGAGCCTGA 
GAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCGGGAGACCCGCCTCAACCAGAATACGACTT 
GGAGCTTATAACATCCTGCTCCTCCAATGTGTCGGTCGCGCGCGACGGCGCTGGCAAAAGGGTCTATTAT 

CTGACCCGTGACCCTGAGACTCCCCTCGCGCGTGCCGCTTGGGAGACAGCAAGACACACTCCAGTGAACT
CCTGGCTAGGCAACATCATCATGTTTGCCCCCACTCTGTGGGTACGGATGGTCCTCATGACCCATTTTTT 
CTCCATACTCATAGCTCAGGAGCACCTTGGAAAGGCTCTAGATTGTGAAATCTATGGAGCCGTACACTCC 
GTCCAACCGTTGGACTTACCTGAAATCATCCAAAGACTCCACAGCCTCAGCGCGTTTTCGCTCCACAGTT 
ACTCTCCAGGTGAAATCAATAGGGTGGCTGCATGCCTCAGGAAGCTTGGGGTTCCGCCCTTGCGAGCTTG 

GAGACACCGGGCCCGGAGCGTTCGCGCCACACTCCTATCCCAGGGGGGGAAAGCCGCTATATGCGGTAAG
TACCTCTTCAACTGGGCGGTGAAAACCAAACTCAAACTCACTCCATTACCGTCCATGTCTCAGTTGGACT 
TGTCCAACTGGTTCACGGGCGGTTACAGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCG 
TTTGTTCCTCTGGTGCCTACTCCTACTTTCAGTAGGGGTAGGCATCTATCTCCTTCCCAACCGATAGACG 
GNTGGGCAACCACTCCGGGTCTTTAGGCCCTATTTAAACACTCCAGGCCTTTAGGCCCCGT 

(SEQ ID NO: 6691)
gi|23510419|ref|NM_000043.3| Homo sapiens tumor necrosis factor receptor

superfamily, member 6 (TNFRSF6), transcript variant 1, mRNA

CCTACCCGCGCGCAGGCCAAGTTGCTGAATCAATGGAGCCCTCCCCAACCCGGGCGTTCCCCAGCGAGGC 

TTCCTTCCCATCCTCCTGACCACCGGGGCTTTTCGTGAGCTCGTCTCTGATCTCGCGCAAGAGTGACACA 

CAGGTGTTCAAAGACGCTTCTGGGGAGTGAGGGAAGCGGTTTACGAGTGACTTGGCTGGAGCCTCAGGGG 

CGGGCACTGGCACGGAACACACCCTGAGGCCAGCCCTGGCTGCCCAGGCGGAGCTGCCTCTTCTCCCGCG 

GGTTGGTGGACCCGCTCAGTACGGAGTTGGGGAAGCTCTTTCACTTCGGAGGATTGCTCAACAACCATGC 

TGGGCATCTGGACCCTCCTACCTCTGGTTCTTACGTCTGTTGCTAGATTATCGTCCAAAAGTGTTAATGC 

CCAAGTGACTGACATCAACTCCAAGGGATTGGAATTGAGGAAGACTGTTACTACAGTTGAGACTCAGAAC 

TTGGAAGGCCTGCATCATGATGGCCAATTCTGCCATAAGCCCTGTCCTGCAGGTGAAAGGAAAGCTAGGG 

ACTGCACAGTCAATGGGGATGAACCAGACTGCGTGCCCTGCCAAGAAGGGAAGGAGTACACAGACAAAGC 

CCATTTTTCTTCCAAATGCAGAAGATGTAGATTGTGTGATGAAGGACATGGCTTAGAAGTGGAAATAAAC 

TGCACCCGGACCCAGAATACCAAGTGCAGATGTAAACCAAACTTTTTTTGTAACTCTACTGTATGTGAAC 

ACTGTGACCCTTGCACCAAATGTGAACATGGAATCATCAAGGAATGCACACTCACCAGCAACACCAAGTG 

CAAAGAGGAAGGATCCAGATCTAACTTGGGGTGGCTTTGTCTTCTTCTTTTGCCAATTCCACTAATTGTT 

TGGGTGAAGAGAAAGGAAGTACAGAAAACATGCAGAAAGCACAGAAAGGAAAACCAAGGTTCTCATGAAT 

CTCCAACCTTAAATCCTGAAACAGTGGCAATAAATTTATCTGATGTTGACTTGAGTAAATATATCACCAC 

TATTGCTGGAGTCATGACACTAAGTCAAGTTAAAGGCTTTGTTCGAAAGAATGGTGTCAATGAAGCCAAA 

ATAGATGAGATCAAGAATGACAATGTCCAAGACACAGCAGAACAGAAAGTTCAACTGCTTCGTAATTGGC 

ATCAACTTCATGGAAAGAAAGAAGCGTATGACACATTGATTAAAGATCTCAAAAAAGCCAATCTTTGTAC 

TCTTGCAGAGAAAATTCAGACTATCATCCTCAAGGACATTACTAGTGACTCAGAAAATTCAAACTTCAGA

AATGAAATCCAAAGCTTGGTCTAGAGTGAAAAACAACAAATTCAGTTCTGAGTATATGCAATTAGTGTTT

GAAAAGATTCTTAATAGCTGGCTGTAAATACTGCTTGGTTTTTTACTGGGTACATTTTATCATTTATTAG 

CGCTGAAGAGCCAACATATTTGTAGATTTTTAATATCTCATGATTCTGCCTCCAAGGATGTTTAAAATCT 

AGTTGGGAAAACAAACTTCATCAAGAGTAAATGCAGTGGCATGCTAAGTACCCAAATAGGAGTGTATGCA 

GAGGATGAAAGATTAAGATTATGCTCTGGCATCTAACATATGATTCTGTAGTATGAATGTAATCAGTGTA 

TGTTAGTACAAATGTCTATCCACAGGCTAACCCCACTCTATGAATCAATAGAAGAAGCTATGACCTTTTG

CTGAAATATCAGTTACTGAACAGGCAGGCCACTTTGCCTCTAAATTACCTCTGATAATTCTAGAGATTTT 

ACCATATTTCTAAACTTTGTTTATAACTCTGAGAAGATCATATTTATGTAAAGTATATGTATTTGAGTGC 

AGAATTTAAATAAGGCTCTACCTCAAAGACCTTTGCACAGTTTATTGGTGTCATATTATACAATATTTCA 

ATTGTGAATTCACATAGAAAACATTAAATTATAATGTTTGACTATTATATATGTGTATGCATTTTACTGG 

CTCAAAACTACCTACTTCTTTCTCAGGCATCAAAAGCATTTTGAGCAGGAGAGTATTACTAGAGCTTTGC 

CACCTCTCCATTTTTGCCTTGGTGCTCATCTTAATGGCCTAATGCACCCCCAAACATGGAAATATCACCA 

AAAAATACTTAATAGTCCACCAAAAGGCAAGACTGCCCTTAGAAATTCTAGCCTGGTTTGGAGATACTAA

CTGCTCTCAGAGAAAGTAGCTTTGTGACATGTCATGAACCCATGTTTGCAATCAAAGATGATAAAATAGA

TTCTTATTTTTCCCCCACCCCCGAAAATGTTCAATAATGTCCCATGTAAAACCTGCTACAAATGGCAGCT 

TATACATAGCAATGGTAAAATCATCATCTGGATTTAGGAATTGCTCTTGTCATACCCCCAAGTTTCTAAG 

ATTTAAGATTCTCCTTACTACTATCCTACGTTTAAATATCTTTGAAAGTTTGTATTAAATGTGAATTTTA 

AGAAATAATATTTATATTTCTGTAAATGTAAACTGTGAAGATAGTTATAAACTGAAGCAGATACCTGGAA 

CCACCTAAAGAACTTCCATTTATGGAGGATTTTTTTGCCCCTTGTGTTTGGAATTATAAAATATAGGTAA 

AAG T ACG T AAT T AAAT AAT GT T T T T GGT AAAAAAAZVAAAAAAAAAAAAAAAAAAAAAAAAAAA^^ 

A^^AAAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 6692) 



gi|35910|emb|X12387.1|HSRCYP3 Human mRNA for cytochrome P-450 (cyp3 locus)

GAATTCCCAAAGAGCAACACAGAGCTGAAAGGAAGACTCAGAGGAGAGAGATAAGTAAGGAAAGTAGTGA 

TGGCTCTCATCCCAGACTTGGCCATGGAAACCTGGCTTCTCCTGGCTGTCAGCCTGGTGCTCCTCTATCT 

ATATGGAACCCATTCACATGGACTTTTTAAGAAGCTTGGAATTCCAGGGCCCACACCTCTGCCTTTTTTG 

GGAAATATTTTGTCCTACCATAAGGGCTTTTGTATGTTTGACATGGAATGTCATAAAAAGTATGGAAAAG 

TGTGGGGCTTTTATGATGGTCAACAGCCTGTGCTGGCTATCACAGATCCTGACATGATCAAAACAGTGCT 

AGTGAAAGAATGTTATTCTGTCTTCACAAACCGGAGGCCTTTTGGTCCAGTGGGATTTATGAAAAGTGCC 

ATCTCTATAGCTGAGGATGAAGAATGGAAGAGATTACGATCATTGCTGTCTCCAACCTTCACCAGTGGAA 

AACTCAAGGAGATGGTCCCTATCATTGCCCAGTATGGAGATGTGTTGGTGAGAAATCTGAGGCGGGAAGC 

AGAGACAGGCAAGCCTGTCACCTTGAAAGACGTCTTTGGGGCCTACAGCATGGATGTGATCACTAGCACA 

TCATTTGGAGTGAACATCGACTCTCTCAACAATCCACAAGACCCCTTTGTGGAAAACACCAAGAAGCTTT
TAAGATTTGATTTTTTGGATCCATTCTTTCTCTCAATAACAGTCTTTCCATTCCTCATCCCAATTCTTGA
AGTATTAAATATCTGTGTGTTTCCAAGAGAAGTTACAAATTTTTTAAGAAAATCTGTAAAAAGGATGAAA 
GAAAGTCGCCTCGAAGATACACAAAAGCACCGAGTGGATTTCCTTCAGCTGATGATTGACTCTCAGAATT 
CAAAAGAAACTGAGTCCCACAAAGCTCTGTCCGATCTGGAGCTCGTGGCCCAATCAATTATCTTTATTTT 
TGCTGGCTATGAAACCACGAGCAGTGTTCTCTCCTTCATTATGTATGAACTGGCCACTCACCCTGATGTC
CAGCAGAAACTGCAGGAGGAAATTGATGCAGTTTTACCCAATAAGGCACCACCCACCTATGATACTGTGC 
TACAGATGGAGTATCTTGACATGGTGGTGAATGAAACGCTCAGATTATTCCCAATTGCTATGAGACTTGA 
GAGGGTCTGCAAAAAAGATGTTGAGATCAATGGGATGTTCATTCCCAAAGGGTGGGTGGTGATGATTCCA 
AGCTATGCTCTTCACCGTGACCCAAAGTACTGGACAGAGCCTGAGAAGTTCCTCCCTGAAAGATTCAGCA 

AGAAGAACAAGGACAACATAGATCCTTACATATACACACCCTTTGGAAGTGGACCCAGAAACTGCATTGG
CATGAGGTTTGCTCTCATGAACATGAAACTTGCTCTAATCAGAGTCCTTCAGAACTTCTCCTTCAAACCT 
TGTAAAGAAACACAGATCCCCCTGAAATTAAGCTTAGGAGGACTTCTTCAACCAGAAAAACCCGTTGTTC 
TAAAGGTTGAGTCAAGGGATGGCACCGTAAGTGGAGCCTGAATTTTCCTAAGGACTTCTGCTTTGCTCTT 
CAAGAAATCTGTGCCTGAGAACACCAGAGACCTCAAATTACTTTGTGAATAGAACTCTGAAATGAAGATG 

GGCTTCATCCAATGGACTGCATAAATAACCGGGGATTCTGTACATGCATTGAGCTCTCTCATTGTCTGTG
TAGAGTGTTATACTTGGGAATATAAAGGAGGTGACCAAATCAGTGTGAGGAGGTAGATTTGGCTCCTCTG
CTTCTCACGGGACTATTTCCACCACCCCCAGTTAGCACCATTAACTCCTCCTGAGCTCTGATAAGAGAAT 
CAACATTTCTCAATAATTTCCTCCACAAATTATTAATGAAAATAAGAATTATTTTGATGGCTCTAACAAT
GACATTTATATCACATGTTTTCTCTGGAGTATTCTATAGTTTTATGTTAAATCAATAAAGACCACTTTAC 

AAAAGTATTATCAGATGCTTTCCTGCACATTAAGGAGAATCTATAGAACTGAATGAGAACCAACAAGTAA
ATATTTTTGGTCATTGTAATCACTGTTGGCGTGGGGCCTTTGTCAGAACTAGAATTTGATTATTAACATA 
GGTGAAAGTTAATCCACTGTGACTTTGCCCATTGTTTAGAAAGAATATTCATAGTTTAATTATGCCTTTT 
TTGATCAGGCACATGGCTCACGCCTGTAATCCTAGCAGTTTGGGAGGCTGAGCCGGGTGGATCGCCTGAG 
GTCAGGAGTTCAAGACAAGCCTGGCCTACATGGTGAAACCCCATCTCTACTAAAAATACACAAATTAGCT 

AGGCATGGTGGACTCGCCTGTAATCTCACTACACAGGAGGCTGAGGCAGGAGAATCACTTGAACCTGGGA
GGCGGATGTTGAAGTGAGCTGAGATTGCACCACTGCACTCCAGTCTGGGTGAGAGTGAGACTCAGTCTTA 
AAAAAATATGCCTTTTTGAAGCACGTACATTTTGTAACAAAGAACTGAAGCTCTTATTATATTATTAGTT 
TTGATTTAATGTTTTCAGCCCATCTCCTTTCATATTTCTGGGAGACAGAAAACATGTTTCCCTACACCTC 
TTGCTTCCATCCTCAACACCCAACTGTCTCGATGCAATGAACACTTAATAAAAAACAGTCGATTGGTCAA 

30 AAAAAAAAAAAAAAAAAAAAAAAGAATTC (SEQ ID NO: 6693) 



gi|339549|gb|M19154.1|HUMTGFB2A Human transforming growth factor-beta-2
mRNA, complete cds

GCCCCTCCCGTCAGTTCGCCAGCTGCCAGCCCCGGGACCTTTTCATCTCTTCCCTTTTGGCCGGAGGAGC
CGAGTTCAGATCCGCCACTCCGCACCCGAGACTGACACACTGAACTCCACTTCCTCCTCTTAAATTTATT 
TCTACTTAATAGCCACTCGTCTCTTTTTTTCCCCATCTCATTGCTCCAAGAATTTTTTTCTTCTTACTCG 
CCAAAGTCAGGGTTCCCTCTGCCCGTCCCGTATTAATATTTCCACTTTTGGAACTACTGGCCTTTTCTTT 
TTAAAGGAATTCAAGCAGGATACGTTTTTCTGTTGGGCATTGACTAGATTGTTTGCAAAAGTTTCGCATC 

AAAAACAACAACAACAAAAAACCAAACAACTCTCCTTGATCTATACTTTGAGAATTGTTGATTTCTTTTT
TTTATTCTGACTTTTAAAAACAACTTTTTTTTCCACTTTTTTAAAAAATGCACTACTGTGTGCTGAGCGC 
TTTTCTGATCCTGCATCTGGTCACGGTCGCGCTCAGCCTGTCTACCTGCAGCACACTCGATATGGACCAG 
TTCATGCGCAAGAGGATCGAGGCGATCCGCGGGCAGATCCTGAGCAAGCTGAAGCTCACCAGTCCCCCAG 
AAGACTATCCTGAGCCCGAGGAAGTCCCCCCGGAGGTGATTTCCATCTACAACAGCACCAGGGACTTGCT 

CCAGGAGAAGGCGAGCCGGAGGGCGGCCGCCTGCGAGCGCGAGAGGAGCGACGAAGAGTACTACGCCAAG
GAGGTTTACAAAATAGACATGCCGCCCTTCTTCCCCTCCGAAACTGTCTGCCCAGTTGTTACAACACCCT 
CTGGCTCAGTGGGCAGCTTGTGCTCCAGACAGTCCCAGGTGCTCTGTGGGTACCTTGATGCCATCCCGCC 
CACTTTCTACAGACCCTACTTCAGAATTGTTCGATTTGACGTCTCAGCAATGGAGAAGAATGCTTCCAAT 
TTGGTGAAAGCAGAGTTCAGAGTCTTTCGTTTGCAGAACCCAAAAGCCAGAGTGCCTGAACAACGGATTG 

AGCTATATCAGATTCTCAAGTCCAAAGATTTAACATCTCCAACCCAGCGCTACATCGACAGCAAAGTTGT
GAAAACAAGAGCAGAAGGCGAATGGCTCTCCTTCGATGTAACTGATGCTGTTCATGAATGGCTTCACCAT 
AAAGACAGGAACCTGGGATTTAAAATAAGCTTACACTGTCCCTGCTGCACTTTTGTACCATCTAATAATT 
ACATCATCCCAAATAAAAGTGAAGAACTAGAAGCAAGATTTGCAGGTATTGATGGCACCTCCACATATAC
CAGTGGTGATCAGAAAACTATAAAGTCCACTAGGAAAAAAAACAGTGGGAAGACCCCACATCTCCTGCTA 

ATGTTATTGCCCTCCTACAGACTTGAGTCACAACAGACCAACCGGCGGAAGAAGCGTGCTTTGGATGCGG
CCTATTGCTTTAGAAATGTGCAGGATAATTGCTGCCTACGTCCACTTTACATTGATTTCAAGAGGGATCT
AGGGTGGAAATGGATACACGAACCCAAAGGGTACAATGCCAACTTCTGTGCTGGAGCATGCCCGTATTTA 
TGGAGTTCAGACACTCAGCACAGCAGGGTCCTGAGCTTATATAATACCATAAATCCAGAAGCATCTGCTT 
CTCCTTGCTGCGTGTCCCAAGATTTAGAACCTCTAACCATTCTCTACTACATTGGCAAAACACCCAAGAT 
TGAACAGCTTTCTAATATGATTGTAAAGTCTTGCAAATGCAGCTAAAATTCTTGGAAAAGTGGCAAGACC
AAAATGACAATGATGATGATAATGATGATGACGACGACAACGATGATGCTTGTAACAAGAAAACATAAGA 
GAGCCTTGGTTCATCAGTGTTAAAAAATTTTTGAAAAGGCGGTACTAGTTCAGACACTTTGGAAGTTTGT 
GTTCTGTTTGTTAAAACTGGCATCTGACACAAAAAAAGTTGAAGGCCTTATTCTACATTTCACCTACTTT 
GTAAGTGAGAGAGACAAGAAGCAAATTTTTTTTAAAGAAAAAAATAAACACTGGAAGAATTTATTAGTGT

TAATTATGTGAACAACGACAACAACAACAACAACAACAAACAGGAAAATCCCATTAAGTGGAGTTGCTGT
ACGTACCGTTCCTATCCCGCGCCTCACTTGATTTTTCTGTATTGCTATGCAATAGGCACCCTTCCCATTC 
TTACTCTTAGAGTTAACAGTGAGTTATTTATTGTGTGTTACTATATAATGAACGTTTCATTGCCCTTGGA 
AAATAAAACAGGTGTATAAAGTGGAGACCAAATACTTTGCCAGAAACTCATGGATGGCTTAAGGAACTTG 
AACTCAAACGAGCCAGAAAAAAAGAGGTCATATTAATGGGATGAAAACCCAAGTGAGTTATTATATGACC 

GAGAAAGTCTGCATTAAGATAAAGACCCTGAAAACACATGTTATGTATCAGCTGCCTAAGGAAGCTTCTT
GTAAGGTCCAAAAACTAAAAAGACTGTTAATAAAAGAAACTTTCAGTCAG (SEQ ID NO: 6694) 



gi|186624|gb|J04111.1|HUMJUNA Human c-jun proto oncogene (JUN), complete
cds, clone hCJ-1

CCCGGGGAGGGGACCGGGGAACAGAGGGCCGAGAGGCGTGCGGCAGGGGGGAGGGTAGGAGAAAGAAGGG 
CCCGACTGTAGGAGGGCAGCGGAGCATTACCTCATCCCGTGAGCCTCCGCGGGCCCAGAGAAGAATCTTC 
TAGGGTGGAGTCTCCATGGTGACGGGCGGGCCCGCCCCCCTGAGAGCGACGCGAGCCAATGGGAAGGCCT 
TGGGGTGACATCATGGGCTATTTTTAGGGGTTGACTGGTAGCAGATAAGTGTTGAGCTCGGGCTGGATAA 

GGGCTCAGAGTTGCACTGAGTGTGGCTGAAGCAGCGAGGCGGGAGTGGAGGTGCGCGGAGTCAGGCAGAC
AGACAGACACAGCCAGCCAGCCAGGTCGGCAGTATAGTCCGAACTGCAAATCTTATTTTCTTTTCACCTT 
CTCTCTAACTGCCCAGAGCTAGCGCCTGTGGCTCCCGGGCTGGTGGTTCGGGAGTGTCCAGAGAGCCTTG 
TCTCCAGCCGGCCCCGGGAGGAGAGCCCTGCTGCCCAGGCGCTGTTGACAGCGGCGGAAAGCAGCGGTAC 
CCCACGCGCCCGCCGGGGGACGTCGGCGAGCGGCTGCAGCAGCAAAGAACTTTCCCGGCGGGGAGGACCG 

GAGACAAGTGGCAGAGTCCCGGAGCGAACTTTTGCAAGCCTTTCCTGCGTCTTAGGCTTCTCCACGGCGG
TAAAGACCAGAAGGCGGCGGAGAGCCACGCAAGAGAAGAAGGACGTGCGCTCAGCTTCGCTCGCACCGGT 
TGTTGAACTTGGGCGAGCGCGAGCCGCGGCTGCCGGGCGCCCCCTCCCCCTAGCAGCGGAGGAGGGGACA 
AGTCGTCGGAGTCCGGGCGGCCAAGACCCGCCGCCGGCCGGCCACTGCAGGGTCCGCACTGATCCGCTCC 
GCGGGGAGAGCCGCTGCTCTGGGAAGTGAGTTCGCCTGCGGACTCCGAGGAACCGCTGCGCCCGAAGAGC 

GCTCAGTGAGTGACCGCGACTTTTCAAAGCCGGGTAGCGCGCGCGAGTCGACAAGTAAGAGTGCGGGAGG
CATCTTAATTAACCCTGCGCTCCCTGGAGCGAGCTGGTGAGGAGGGCGCAGCGGGGACGACAGCCAGCGG 
GTGCGTGCGCTCTTAGAGAAACTTTCCCTGTCAAAGGCTCCGGGGGGCGCGGGTGTCCCCCGCTTGCCAG 
AGCCCTGTTGCGGCCCCGAAACTTGTGCGCGCACGCCAAACTAACCTCACGTGAAGTGACGGACTGTTCT 
ATGACTGCAAAGATGGAAACGACCTTCTATGACGATGCCCTCAACGCCTCGTTCCTCCCGTCCGAGAGCG 

GACCTTATGGCTACAGTAACCCCAAGATCCTGAAACAGAGCATGACCCTGAACCTGGCCGACCCAGTGGG
GAGCCTGAAGCCGCACCTCCGCGCCAAGAACTCGGACCTCCTCACCTCGCCCGACGTGGGGCTGCTCAAG 
CTGGCGTCGCCCGAGCTGGAGCGCCTGATAATCCAGTCCAGCAACGGGCACATCACCACCACGCCGACCC 
CCACCCAGTTCCTGTGCCCCAAGAACGTGACAGATGAGCAGGAGGGGTTCGCCGAGGGCTTCGTGCGCGC 
CCTGGCCGAACTGCACAGCCAGAACACGCTGCCCAGCGTCACGTCGGCGGCGCAGCCGGTCAACGGGGCA 

GGCATGGTGGCTCCCGCGGTAGCCTCGGTGGCAGGGGGCAGCGGCAGCGGCGGCTTCAGCGCCAGCCTGC
ACAGCGAGCCGCCGGTCTACGCAAACCTCAGCAACTTCAACCCAGGCGCGCTGAGCAGCGGCGGCGGGGC 
GCCCTCCTACGGCGCGGCCGGCCTGGCCTTTCCCGCGCAACCCCAGCAGCAGCAGCAGCCGCCGCACCAC 
CTGCCCCAGCAGATGCCCGTGCAGCACCCGCGGCTGCAGGCCCTGAAGGAGGAGCCTCAGACAGTGCCCG 
AGATGCCCGGCGAGACACCGCCCCTGTCCCCCATCGACATGGAGTCCCAGGAGCGGATCAAGGCGGAGAG 

GAAGCGCATGAGGAACCGCATCGCTGCCTCCAAGTGCCGAAAAAGGAAGCTGGAGAGAATCGCCCGGCTG
GAGGAAAAAGTGAAAACCTTGAAAGCTCAGAACTCGGAGCTGGCGTCCACGGCCAACATGCTCAGGGAAC 
AGGTGGCACAGCTTAAACAGAAAGTCATGAACCACGTTAACAGTGGGTGCCAACTCATGCTAACGCAGCA 
GTTGCAAACATTTTGAAGAGAGACCGTCGGGGGCTGAGGGGCAACGAAGAAAAAAAATAACACAGAGAGA 
CAGACTTGAGAACTTGACAAGTTGCGACGGAGAGAAAAAAGAAGTGTCCGAGAACTAAAGCCAAGGGTAT 

CCAAGTTGGACTGGGTTCGGTCTGACGGCGCCCCCAGTGTGCACGAGTGGGAAGGACTTGGTCGCGCCCT
CCCTTGGCGTGGAGCCAGGGAGCGGCCGCCTGCGGGCTGCCCCGCTTTGCGGACGGGCTGTCCCCGCGCG

AACGGAACGTTGGACTTTCGTTAACATTGACCAAGAACTGCATGGACCTAACATTCGATCTCATTCAGTA 

TTAAAGGGGGGAGGGGGAGGGGGTTACAAACTGCAATAGAGACTGTAGATTGCTTCTGTAGTACTCCTTA 

AGAACACAAAGCGGGGGGAGGGTTGGGGAGGGGCGGCAGGAGGGAGGTTTGTGAGAGCGAGGCTGAGCCT 

ACAGATGAACTCTTTCTGGCCTGCTTTCGTTAACTGTGTATGTACATATATATATTTTTTAATTTGATTA 

AAGCTGATTACTGTCAATAAACAGCTTCATGCCTTTGTAAGTTATTTCTTGTTTGTTTGTTTGGGTATCC 

TGCCCAGTGTTGTTTGTAAATAAGAGATTTGGAGCACTCTGAGTTTACCATTTGTAATAAAGTATATAAT 

TTTTTTATGTTTTGTTTCTGAAAATTCCAGAAAGGATATTTAAGAAAATACAATAAACTATTGGAAAGTA 

CTCCCCTAACCTCTTTTCTGCATCATCTGTAGATCCTAGTCTATCTAGGTGGAGTTGAAAGAGTTAAGAA 

TGCTCGATAAAATCACTCTCAGTGCTTCTTACTATTAAGCAGTAAAAACTGTTCTCTATTAGACTTAGAA 

ATAAATGTACCTGATGTACCTGATGCTATGTCAGGCTTCATACTCCACGCTCCCCCAGCGTATCTATATG 

GAATTGCTTACCAAAGGCTAGTGCGATGTTTCAGGAGGCTGGAGGAAGGGGGGTTGCAGTGGAGAGGGAC 

AGCCCACTGAGAAGTCAAACATTTCAAAGTTTGGATTGCATCAAGTGGCATGTGCTGTGACCATTTATAA 

TGTTAGAAATTTTACAATAGGTGCTTATTCTCAAAGCAGGAATTGGTGGCAGATTTTACAAAAGATGTAT 

CCTTCCAATTTGGAATCTTCTCTTTGACAATTCCTAGATAAAAAGATGGCCTTTGTCTTATGAATATTTA 

TAACAGCATTCTGTCACAATAAATGTATTCAAATACCAATAACAGATCTTGAATTGCTTCCCTTTACTAG

TTTTTTGTTCCCAAGTTATATACTGAAGTTTTTATTTTTAGTTGCTGAGGTT (SEQ ID NO: 6695) 



gi|179982|gb|M57729.1|HUMCCC5 Human complement component C5 mRNA, complete
cds

CTACCTCCAACCATGGGCCTTTTGGGAATACTTTGTTTTTTAATCTTCCTGGGGAAAACCTGGGGACAGG 

AGCAAACATATGTCATTTCAGCACCAAAAATATTCCGTGTTGGAGCATCTGAAAATATTGTGATTCAAGT 

TTATGGATACACTGAAGCATTTGATGCAACAATCTCTATTAAAAGTTATCCTGATAAAAAATTTAGTTAC 

TCCTCAGGCCATGTTCATTTATCCTCAGAGAATAAATTCCAAAACTCTGCAATCTTAACAATACAACCAA 

AACAATTGCCTGGAGGACAAAACCCAGTTTCTTATGTGTATTTGGAAGTTGTATCAAAGCATTTTTCAAA 

ATCAAAAAGAATGCCAATAACCTATGACAATGGATTTCTCTTCATTCATACAGACAAACCTGTTTATACT 

CCAGACCAGTCAGTAAAAGTTAGAGTTTATTCGTTGAATGACGACTTGAAGCCAGCCAAAAGAGAAACTG 

TCTTAACCTTCATAGATCCTGAAGGATCAGAAGTTGACATGGTAGAAGAAATTGATCATATTGGAATTAT 

CTCTTTTCCTGACTTCAAGATTCCGTCTAATCCTAGATATGGTATGTGGACGATCAAGGCTAAATATAAA 

GAGGACTTTTCAACAACTGGAACCGCATATTTTGAAGTTAAAGAATATGTCTTGCCACATTTTTCTGTCT 

CAATCGAGCCAGAATATAATTTCATTGGTTACAAGAACTTTAAGAATTTTGAAATTACTATAAAAGCAAG 

ATATTTTTATAATAAAGTAGTCACTGAGGCTGACGTTTATATCACATTTGGAATAAGAGAAGACTTAAAA 

GATGATCAAAAAGAAATGATGCAAACAGCAATGCAAAACACAATGTTGATAAATGGAATTGCTCAAGTCA 

CATTTGATTCTGAAACAGCAGTCAAAGAACTGTCATACTACAGTTTAGAAGATTTAAACAACAAGTACCT 

TTATATTGCTGTAACAGTCATAGAGTCTACAGGTGGATTTTCTGAAGAGGCAGAAATACCTGGCATCAAA 

TATGTCCTCTCTCCCTACAAACTGAATTTGGTTGCTACTCCTCTTTTCCTGAAGCCTGGGATTCCATATC 

CCATCAAGGTGCAGGTTAAAGATTCGCTTGACCAGTTGGTAGGAGGAGTCCCAGTAATACTGAATGCACA 

AACAATTGATGTAAACCAAGAGACATCTGACTTGGATCCAAGCAAAAGTGTAACACGTGTTGATGATGGA 

GTAGCTTCCTTTGTGCTTAATCTCCCATCTGGAGTGACGGTGCTGGAGTTTAATGTCAAAACTGATGCTC 

CAGATCTTCCAGAAGAAAATCAGGCCAGGGAAGGTTACCGAGCAATAGCATACTCATCTCTCAGCCAAAG 

TTACCTTTATATTGATTGGACTGATAACCATAAGGCTTTGCTAGTGGGAGAACATCTGAATATTATTGTT 

ACCCCCAAAAGCCCATATATTGACTVAAATAACTCACTATAATTACTTGATTTTATCCAAGGGCAAT^TTA 

TCCATTTTGGCACGAGGGAGAAATTTTCAGATGCATCTTATCAAAGTATAAACATTCCAGTAACACAGAA 

CATGGTTCCTTCATCCCGACTTCTGGTCTATTATATCGTCACAGGAGAACAGACAGCAGAATTAGTGTCT 

GATTCAGTCTGGTTAAATATTGAAGAAAAATGTGGCAACCAGCTCCAGGTTCATCTGTCTCCTGATGCAG 

ATGCATATTCTCCAGGCCAAACTGTGTCTCTTAATATGGCAACTGGAATGGATTCCTGGGTGGCATTAGC 

AGCAGTGGACAGTGCTGTGTATGGAGTCCAAAGAGGAGCCAAAAAGCCCTTGGAAAGAGTATTTCAATTC 

TTAGAGAAGAGTGATCTGGGCTGTGGGGCAGGTGGTGGCCTCAACAATGCCAATGTGTTCCACCTAGCTG 

GACTTACCTTCCTCACTAATGCAAATGCAGATGACTCCCAAGAAAATGATGAACCTTGTAAAGAAATTCT 

CAGGCCAAGAAGAACGCTGCAAAAGAAGATAGAAGAAATAGCTGCTAAATATAAACATTCAGTAGTGAAG 

AAATGTTGTTACGATGGAGCCTGCGTTAATAATGATGAAACCTGTGAGCAGCGAGCTGCACGGATTAGTT 

TAGGGCCAAGATGCATCAAAGCTTTCACTGAATGTTGTGTCGTCGCAAGCCAGCTCCGTGCTAATATCTC 

TCATAAAGACATGCAATTGGGAAGGCTACACATGAAGACCCTGTTACCAGTAAGCAAGCCAGAAATTCGG 

AGTTATTTTCCAGAAAGCTGGTTGTGGGAAGTTCATCTTGTTCCCAGAAGAAAACAGTTGCAGTTTGCCC 






TACCTGATTCTCTAACCACCTGGGAAATTCAAGGCATTGGCATTTCAAACACTGGTATATGTGTTGCTGA 
TACTGTCAAGGCAAAGGTGTTCAAAGATGTCTTCCTGGAAATGAATATACCATATTCTGTTGTACGAGGA 
GAACAGATCCAATTGAAAGGAACTGTTTACAACTATAGGACTTCTGGGATGCAGTTCTGTGTTAAAATGT 
CTGCTGTGGAGGGAATCTGCACTTCGGAAAGCCCAGTCATTGATCATCAGGGCACAAAGTCCTCCAAATG 
TGTGCGCCAGAAAGTAGAGGGCTCCTCCAGTCACTTGGTGACATTCACTGTGCTTCCTCTGGAAATTGGC
CTTCACAACATCAATTTTTCACTGGAGACTTGGTTTGGAAAAGAAATCTTAGTAAAAACATTACGAGTGG 
TGCCAGAAGGTGTCAAAAGGGAAAGCTATTCTGGTGTTACTTTGGATCCTAGGGGTATTTATGGTACCAT 
TAGCAGACGAAAGGAGTTCCCATACAGGATACCCTTAGATTTGGTCCCCAAAACAGAAATCAAAAGGATT 
TTGAGTGTAAAAGGACTGCTTGTAGGTGAGATCTTGTCTGCAGTTCTAAGTCAGGAAGGCATCAATATCC 

TAACCCACCTCCCCAAAGGGAGTGCAGAGGCGGAGCTGATGAGCGTTGTCCCAGTATTCTATGTTTTTCA
CTACCTGGAAACAGGAAATCATTGGAACATTTTTCATTCTGACCCATTAATTGAAAAGCAGAAACTGAAG 
AAAAAATTAAAAGAAGGGATGTTGAGCATTATGTCCTACAGAAATGCTGACTACTCTTACAGTGTGTGGA 
AGGGTGGAAGTGCTAGCACTTGGTTAACAGCTTTTGCTTTAAGAGTACTTGGACAAGTAAATAAATACGT 
AGAGCAGAACCAAAATTCAATTTGTAATTCTTTATTGTGGCTAGTTGAGAATTATCAATTAGATAATGGA 

TCTTTCAAGGAAAATTCACAGTATCAACCAAT7VAAATTACAGGGTACCTTGCCTGTTGAAGCCCGAGAGA
ACAGCTTATATCTTACAGCCTTTACTGTGATTGGAATTAGAAAGGCTTTCGATATATGCCCCCTGGTGAA 
AATCGACACAGCTCTAATTAAAGCTGACAACTTTCTGCTTGAAAATACACTGCCAGCCCAGAGCACCTTT 
ACATTGGCCATTTCTGCGTATGCTCTTTCCCTGGGAGATAAAACTCACCCACAGTTTCGTTCAATTGTTT 
CAGCTTTGAAGAGAGAAGCTTTGGTTAAAGGTAATCCACCCATTTATCGTTTTTGGAAAGACAATCTTCA 

GCATAAAGACAGCTCTGTACCTAACACTGGTACGGCACGTATGGTAGAAACAACTGCCTATGCTTTACTC
ACCAGTCTGAACTTGAAAGATATAAATTATGTTAACCCAGTCATCAAATGGCTATCAGAAGAGCAGAGGT 
ATGGAGGTGGCTTTTATTCAACCCAGGACACCATCAATGCCATTGAGGGCCTGACGGAATATTCACTCCT 
GGTTAAACAACTCCGCTTGAGTATGGACATCGATGTTTCTTACAAGCATAAAGGTGCCTTACATAATTAT 
AAAATGACAGACAAGAATTTCCTTGGGAGGCCAGTAGAGGTGCTTCTCAATGATGACCTCATTGTCAGTA 

CAGGATTTGGCAGTGGCTTGGCTACAGTACATGTAACAACTGTAGTTCACAAAACCAGTACCTCTGAGGA
AGTTTGCAGCTTTTATTTGAAAATCGATACTCAGGATATTGAAGCATCCCACTACAGAGGCTACGGAAAC 
TCTGATTACAAACGCATAGTAGCATGTGCCAGCTACAAGCCCAGCAGGGAAGAATCATCATCTGGATCCT 
CTCATGCGGTGATGGACATCTCCTTGCCTACTGGAATCAGTGCAAATGAAGAAGACTTAAAAGCCCTTGT 
GGAAGGGGTGGATCAACTATTCACTGATTACCAAATCAAAGATGGACATGTTATTCTGCAACTGAATTCG 

ATTCCCTCCAGTGATTTCCTTTGTGTACGATTCCGGATATTTGAACTCTTTGAAGTTGGGTTTCTCAGTC
CTGCCACTTTCACAGTTTACGAATACCACAGACCAGATAAACAGTGTACCATGTTTTATAGCACTTCCAA 
TATCAAAATTCAGAAAGTCTGTGAAGGAGCCGCGTGCAAGTGTGTAGAAGCTGATTGTGGGCAAATGCAG 
GAAGAATTGGATCTGACAATCTCTGCAGAGACAAGAAAACAAACAGCATGTAAACCAGAGATTGCATATG 
CTTATAAAGTTAGCATCACATCCATCACTGTAGAAAATGTTTTTGTCAAGTACAAGGCAACCCTTCTGGA 

TATCTACATAACTGGGGAAGCTGTTGCTGAGAAAGACTCTGAGATTACCTTCATTAAAAAGGTAACCTGT
ACTAACGCTGAGCTGGTAAAAGGAAGACAGTACTTAATTATGGGTAAAGAAGCCCTCCAGATA7\7U\TACA 
ATTTCAGTTTCAGGTACATCTACCCTTTAGATTCCTTGACCTGGATTGAATACTGGCCTAGAGACACAAC 
ATGTTCATCGTGTCAAGCATTTTTAGCTAATTTAGATGAATTTGCCGAAGATATCTTTTTAAATGGATGC 
TAAAATTCCTGAAGTTCAGCTGCATACAGTTTGCACTTATGGACTCCTGTTGTTGAAGTTCGTTTTTTTG 

TTTTCTTCTTTTTTTAAACATTCATAGCTGGTCTTATTTGTAAAGCTCACTTTACTTAGAATTAGTGGCA
CTTGCTTTTATTAGAGAATGATTTCAAATGCTGTAACTTTCTGAAATAACATGGCCTTGGAGGGCATGAA 
GACAGATACTCCTCCAAGGTTATTGGACACCGGAAACAATAAATTGGAACACCTCCTCAAACCTACCACT 
CAGGAATGTTTGCTGGGGCCGAAAGAACAGTCCATTGAAAGGGAGTATTACAAAAACATGGCCTTTGCTT 
GAAAGAAAATACCAAGGAACAGGAAACTGATCATTAAAGCCTGAGTTTGCTTTC (SEQ ID NO: 6696) 




gi|189944|gb|L05144.1|HUMPHOCAR Homo sapiens (clone lamda-hPEC-3)
phosphoenolpyruvate carboxykinase (PCK1) mRNA, complete cds
TGGGAACACAAACTTGCTGGCGGGAAGAGCCCGGAAAGAAACCTGTGGATCTCCCTTCGAGATCATCCAA 

AGAGAAGAAAGGTGACCTCACATTCGTGCCCCTTAGCAGCACTCTGCAGAAATGCCTCCTCAGCTGCAAA
ACGGCCTGAACCTCTCGGCCAAAGTTGTCCAGGGAAGCCTGGACAGCCTGCCCCAGGCAGTGAGGGAGTT 
TCTCGAGAATAACGCTGAGCTGTGTCAGCCTGATCACATCCACATCTGTGACGGCTCTGAGGAGGAGAAT 
GGGCGGCTTCTGGGCCAGATGGAGGAAGAGGGCATCCTCAGGCGGCTGAAGAAGTATGACAACTGCTGGT 
TGGCTCTCACTGACCCCAGGGATGTGGCCAGGATCGAAAGCAAGACGGTTATCGTCACCCAAGAGCAAAG 

AGACACAGTGCCCATCCCCAAAACAGGCCTCAGCCAGCTCGGTCGCTGGATGTCAGAGGAGGATTTTGAG
AAAGCGTTCAATGCCAGGTTCCCAGGGTGCATGAAAGGTCGCACCATGTACGTCATCCCATTCAGCATGG
GGCCGCTGGGCTCACCTCTGTCGAAGATCGGCATCGAGCTGACGGATTCGCCCTACGTGGTGGCCAGCAT 
GCGGATCATGACGCGGATGGGCACGCCCGTCCTGGAAGCACTGGGCGATGGGGAGTTTGTCAAATGCCTC 
CATTCTGTGGGGTGCCCTCTGCCTTTACAAAAGCCTTTGGTCAACAACTGGCCCTGCAACCCGGAGCTGA 
CGCTCATCGCCCACCTGCCTGACCGCAGAGAGATCATCTCCTTTGGCAGTGGGTACGGCGGGAACTCGCT
GCTCGGGAAGAAGTGCTTTGCTCTCAGGATGGCCAGCCGGCTGGCAGAGGAGGAAGGGTGGCTGGCAGAG 
CACATGCTGATTCTGGGTATAACCAACCCTGAGGGTGAGAAGAAGTACCTGGCGGCCGCATTTCCCAGCG 
CCTGCGGGAAGACCAACCTGGCCATGATGAACCCCAGCCTCCCCGGGTGGAAGGTTGAGTGCGTCGGGGA 
TGACATTGCCTGGATGAAGTTTGACGCACAAGGTCATTTAAGGGCCATCAACCCAGAAAATGGCTTTTTC 

GGTGTCGCTCCTGGGACTTCAGTGAAGACCAACCCCAATGCCATCAAGACCATCCAGAAGAACACAATCT
TTACCAATGTGGCCGAGACCAGCGACGGGGGCGTTTACTGGGAAGGCATTGATGAGCCGCTAGCTTCAGG 
CGTCACCATCACGTCCTGGAAGAATAAGGAGTGGAGCTCAGAGGATGGGGAACCTTGTGCCCACCCCAAC 
TCGAGGTTCTGCACCCCTGCCAGCCAGTGCCCCATCATTGATGCTGCCTGGGAGTCTCCGGAAGGTGTTC 
CCATTGAAGGCATTATCTTTGGAGGCCGTAGACCTGCTGGTGTCCCTCTAGTCTATGAAGCTCTCAGCTG 

GCAACATGGAGTCTTTGTGGGGGCGGCCATGAGATCAGAGGCCACAGCGGCTGCAGAACATAAAGGCAAA
ATCATCATGCATGACCCCTTTGCCATGCGGCCCTTCTTTGGCTACAACTTCGGCAAATACCTGGCCCACT 
GGCTTAGCATGGCCCAGCACCCAGCAGCCAAACTGCCCAAGATCTTCCATGTCAACTGGTTCCGGAAGGA 
CAAGGAAGGCAAATTCCTCTGGCCAGGCTTTGGAGAGAACTCCAGGGTGCTGGAGTGGATGTTCAACCGG 
ATCGATGGAAAAGCCAGCACCAACGTCACGCCCATAGGCTACATCCCCAAGGAGGATGCCCTGAACCTGA 

AAGGCCTGGGGCACATCAACATGATGGAGCTTTTCAGCATCTCCAAGGAATTCTGGGACAAGGAGGTGGA
AGACATCGAGAAGTATCTGGTGGATCAAGTCAATGCCGACCTCCCCTGTGAAATCGAGAGAGAGATCCTT 
GCCTTGAAGCAAAGAATAAGCCAGATGTAATCAGGGCCTGAGAATAAGCCAGATGTAATCAGGGCCTGAG 
TGCTTTACCTTTAAAATCATTAAATTAAAATCCATAAGGTGCAGTAGGAGCAAGAGAGGGCAAGTGTTCC 
CAAATTGACGCCACCTAATAATCATCACCACACCGGGAGCAGATCTGAAGGCACACTTTGATTTTTTTAA 

GGATAAGAACCACAGAACACTGGGTAGTAGCTAATGAAATTGAGAAGGGAAATCTTAGCATGCCTCCAAA
AATTCACATCCAATGCATACTTTGTTCAAATTTAAGGTTACTCAGGCATTGATCTTTTCAGTGTTTTTTC 
ACTTAGCTATGTGGATTAGCTAGAATGCACACCAAAAAGATACTTGAGCTGTATATATATATGTGTGTGT 
GTGTGTGTGTGTGTGTGTGTGTGCATGTATGTGCACATGTGTCTGTGTGATATTTGGTATGTGTATTTGT 
ATGTACTGTTATTCAAAATATATTTAATACCTTTGGAAAATCTTGGGCAAGATGACCTACTAGTTTTCCT 

TGAAAAAAAGTTGCTTTGTTATTAATATTGTGCTTAAATTATTTTTATACACCATTGTTCCTTACCTTTA
CATAATTGCAATATTTCCCCCTTACTACTTCTTGGAAAAAAATTAGAAAATGAAGTTTATAGAAAAG
(SEQ ID NO: 6697)



gi|6679892|ref|NM_008061.1| Mus musculus glucose-6-phosphatase, catalytic
(G6pc), mRNA

AGCAGAGGGATCGGGGCCAACCGGGCTTGGACTCACTGCACGGGCTCTGCTGGCAGCTTCCTGAGGTACC 
AAGGGAGGAAGGATGGAGGAAGGAATGAACATTCTCCATGACTTTGGGATCCAGTCGACTCGCTATCTCC 
AAGTGAATTACCAAGACTCCCAGGACTGGTTCATCCTTGTGTCTGTGATTGCTGACCTGAGGAACGCCTT 

CTATGTCCTCTTTCCCATCTGGTTCCATCTTAAAGAGACTGTGGGCATCAATCTCCTCTGGGTGGCAGTG
GTCGGAGACTGGTTCAACCTCGTCTTCZ\AGTGGATTCTGTTTGGACAACGCCCGTATTGGTGGGTCCTGG 
ACACCGACTACTACAGCAACAGCTCCGTGCCTATAATAAAGCAGTTCCCTGTCACCTGTGAGACCGGACC 
AGGAAGTCCCTCTGGCCATGCCATGGGCGCAGCAGGTGTATACTATGTTATGGTCACTTCTACTCTTGCT 
ATCTTTCGAGGAAAGAAAAAGCCAACGTATGGATTCCGGTGTTTGAACGTCATCTTGTGGTTGGGATTCT 

GGGCTGTGCAGCTGAACGTCTGTCTGTCCCGGATCTACCTTGCTGCTCACTTTCCCCACCAGGTCGTGGC
TGGAGTCTTGTCAGGCATTGCTGTGGCTGAAACTTTCAGCCACATCCGGGGCATCTACAATGCCAGCCTC 
CGGAAGTATTGTCTCATCACCATCTTCTTGTTTGGTTTCGCGCTTGGATTCTACCTGCTACTTAAAGGGC
TAGGGGTGGACCTCCTGTGGACTTTGGAGAAAGCCAAGAGATGGTGTGAGCGGCCAGAATGGGTCCACCT 
TGACACTACACCCTTTGCCAGCCTCTTCAAAAACCTGGGAACCCTCTTGGGGTTGGGGCTGGCCCTCAAC 

TCCAGCATGTACCGGAAGAGCTGCAAGGGAGAACTCAGCAAGTCGTTCCCATTCCGCTTCGCCTGCATTG
TGGCTTCCTTGGTCCTCCTGCATCTCTTTGACTCTCTGAAGCCCCCATCCCAGGTTGAGTTGATCTTCTA 
CATCTTGTCTTTCTGCAAGAGCGCAACAGTTCCCTTTGCATCTGTCAGTCTTATCCCATACTGCCTAGCC 
CGGATCCTGGGACAGACACACAAGAAGTCTTTGTAAGGCATGCAGAGTCTTTGGTATTTAAAGTCAACCG 
CCATGCAAAGGACTAGGAACAACTAAAGCCTCTGAAACCCATTGTGAGGCCAGAGGTGTTGACATCGGCC 

CTGGTAGCCCTGTCTTTCTTTGCTATCTTAACCAAAAGGTGAATTTTTACAAAGCTTACAGGGCTGTTTG
AGGAAAGTGTGAATGCTGGAAACTGAGTCATTCTGGATGGTTCCCTGAAGATTCGCTTACCAGCCTCCTG
TCAGATACAGAAGAGCAAGCCCAGGCTAGAGATCCCAACTGAGAATGCTCTTGCGGTGCAGAATCTTCCG 
GCTGGGAAAAGGAAAAGAGCACCATGCATTTGCCAGGAAGAGAAAGAAGGATCGGGAGGAGGGAGAGTGT 
TTTATGTATCGAGCAAACCAGATGCAATCTATGTCTAACCGGCTTCAGTTGTGTCTGCGTCTTTAGATAC 
GACACACTCAATAATAATAATAGACCAACTAGTGTAATGAGTAGCCAGTTAAAGGCGATTAATTCTGCTT
CCAGATAGTCTCCACTGTACATAAAAGTCACACTGTGTGCTTGCATTCCTGTATGGTAGTGGTGACTGTC 
TCTCACACCACCTTCTCTATCACGTCACAGTTTTCTCCTCCTCAGCCTATGTCTGCATTCCCCAGAATTC 
TCCACTTGTTCCCTGGCCCTGCTGCTGGACCCTGCTGTGTCTGGTAGGCAACTGTTTGTTGGTGCTTTTG 
TAGGGTTAAGTTAAACTCTGAGATCTTGGGCAAAATGGCAAGGAGACCCAGGATTCTTCTCTCCAAAGGT 
CACTCCGATGTTATTTTTGATTCCTGGGGCAGAAATATGACTCCTTTCCCTAGCCCAAGCCAGCCAAGAG
CTCTCATTCTTAGAAGAAAAGGCAGCCCCTTGGTGCCTGTCCTCCTGCCTCGGCTGATTTGCAGAGTACT 
TCTTCAAAAAGAAAAAAATGGTAAAGCTATTTATTAAAAATTCTTTGTTTTTTGCTACAAATGATGCATA 
TATTTTCACCCACACCAAGCACTTTGTTTCTAATATCTTTGATAAGAAAACTACATGTGCAGTATTTTAT 
TAAAGCAACATTTTATTTA (SEQ ID NO: 6698) 




gi|7110682|ref|NM_011044.1| Mus musculus phosphoenolpyruvate carboxykinase
1, cytosolic (Pck1), mRNA

ACAGTTGGCCTTCCCTCTGGGAACACACCCTCGGTCAACAGGGGAAATCCGGCAAGGCGCTCAGCGATCT 

CTGATCCAGACCTTCCAAAAGGAAGAAAGGTGGCACCAGAGTTCCTGCCTCTCTCCACACCATTGCAATT
ATGCCTCCTCAGCTGCATAACGGTCTGGACTTCTCTGCCAAGGTTATCCAGGGCAGCCTCGACAGCCTGC 
CCCAGGCAGTGAGGAAGTTCGTGGAAGGCAATGCTCAGCTGTGCCAGCCGGAGTATATCCACATCTGCGA 
TGGCTCCGAGGAGGAGTACGGGCAGTTGCTGGCCCACATGCAGGAGGAGGGTGTCATCCGCAAGCTGAAG 
AAATATGACAACTGTTGGCTGGCTCTCACTGACCCTCGAGATGTGGCCAGGATCGAAAGCAAGACAGTCA 

TCATCACCCAAGAGCAGAGAGACACAGTGCCCATCCCCAAAACTGGCCTCAGCCAGCTGGGCCGCTGGAT
GTCGGAAGAGGACTTTGAGAAAGCATTCAACGCCAGGTTCCCAGGGTGCATGAAAGGCCGCACCATGTAT 
GTCATCCCATTCAGCATGGGGCCACTGGGCTCGCCGCTGGCCAAGATTGGTATTGAACTGACAGACTCGC 
CCTATGTGGTGGCCAGCATGCGGATCATGACTCGGATGGGCATATCTGTGCTGGAGGCCCTGGGAGATGG 
GGAGTTCATCAAGTGCCTGCACTCTGTGGGGTGCCCTCTCCCCTTAAAAAAGCCTTTGGTCAACAACTGG 

GCCTGCAACCCTGAGCTGACCCTGATCGCCCACCTCCCGGACCGCAGAGAGATCATCTCCTTTGGAAGCG
GATATGGTGGGAACTCACTACTCGGGAAGAAATGCTTTGCGTTGCGGATCGCCAGCCGTCTGGCTAAGGA 
GGAAGGGTGGCTGGCGGAGCATATGCTGATCCTGGGCATAACTAACCCCGAAGGCAAGAAGAAATACCTG 
GCCGCAGCCTTCCCTAGTGCCTGTGGGAAGACTAACTTGGCCATGATGAACCCCAGCCTGCCCGGGTGGA 
AGGTCGAATGTGTGGGCGATGACATTGCCTGGATGAAGTTTGATGCCCAAGGCAACTTAAGGGCTATCAA 

CCCAGAAAACGGGTTTTTTGGAGTTGCTCCTGGCACCTCAGTGAAGACAAATCCAAATGCCATTAAAACC
ATCCAGAAAAACACCATCTTCACCAACGTGGCCGAGACTAGCGATGGGGGTGTTTACTGGGAAGGCATCG 
ATGAGCCGCTGGCCCCGGGAGTCACCATCACCTCCTGGAAGAACAAGGAGTGGAGACCGCAGGACGCGGA
ACCATGTGCCCATCCCAACTCGAGATTCTGCACCCCTGCCAGCCAGTGCCCCATTATTGACCCTGCCTGG 
GAATCTCCAGAAGGAGTACCCATTGAGGGTATCATCTTTGGTGGCCGTAGACCTGAAGGTGTCCCCCTTG 

TCTATGAAGCCCTCAGCTGGCAGCATGGGGTGTTTGTAGGAGCAGCCATGAGATCTGAGGCCACAGCTGC
TGCAGAACACAAGGGCAAGATCATCATGCACGACCCCTTTGCCATGCGACCCTTCTTCGGCTACAACTTC 
GGCAAATACCTGGCCCACTGGCTGAGCATGGCCCACCGCCCAGCAGCCAAGTTGCCCAAGATCTTCCATG 
TCAACTGGTTCCGGAAGGACAAAGATGGCAAGTTCCTCTGGCCAGGCTTTGGCGAGAACTCCCGGGTGCT 
GGAGTGGATGTTCGGGCGGATTGAAGGGGAAGACAGCGCCAAGCTCACGCCCATCGGCTACATCCCTAAG 

GAAAACGCCTTGAACCTGAAAGGCCTGGGGGGCGTCAACGTGGAGGAGCTGTTTGGGATCTCTAAGGAGT
TCTGGGAGAAGGAGGTGGAGGAGATCGACAGGTATCTGGAGGACCAGGTCAACACCGACCTCCCTTACGA 
AATTGAGAGGGAGCTCCGAGCCCTGAAACAGAGAATCAGCCAGATGTAAATCCCAATGGGGGCGTCTCGA 
GAGTCACCCCTTCCCACTCACAGCATCGCTGAGATCTAGGAGAAAGCCAGCCTGCTCCAGCTTTGAGATA 
GCGGCACAATCGTGAGTAGATCAGAAAAGCACCTTTTAATAGTCAGTTGAGTAGCACAGAGAACAGGCTA 

GGGGCAAATAAGATTGGGAGGGGAAATCACCGCATAGTCTCTGAAGTTTGCATTTGACACCAATGGGGGT
TTTGGTTCCACTTCAAGGTCACTCAGGAATCCAGTTCTTCACGTTAGCTGTAGCAGTTAGCTAAAATGCA 
CAGAAAACATACTTGAGCTGTATATATGTGTGTGAACGTGTCTCTGTGTGAGCATGTGTGTGTGTGTGTG 
TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTACATGCCTGTCTGTCCCATTGTCCACAGTATATTTAA 
AACCTTTGGGGAAAAATCTTGGGCAAATTTGTAGCTGTAACTAGAGAGTCATGTTGCTTTGTTGCTAGTA 

TGTATGTTTAAATTATTTTTATACACCGCCCTTACCTTTCTTTACATAATTGAAATTGGTATCCGGACCA
CTTCTTGGGAAAAAAATTACAAAATAT^ (SEQ ID NO: 6699)

Example 6. siRNAs decrease mRNA levels in vivo

Male CMV-Luc mice (8-10 weeks old) from Xenogen (Cranbury, NJ) were 
administered cholesterol conjugated siRNA (see Table 16). 

Table 16. Solutions administered to mice

Group   n   Injection Mix
1       7   Buffer (PBS [pH 7.4])
2       8   Cholesterol conjugated siRNA (ALN-3001)



Table 17. Test iRNA agents targeting Luciferase

siRNA      Sequence
ALN-1070   5'-GAA CUG UGU GUG AGA GGU CCU-3' (SEQ ID NO: 6700)
           3'-CG CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6701)
ALN-1000   5'-GAA CUG UGU GUG AGA GGU CCU-GS-3' (SEQ ID NO: 6702)
           3'-CG CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6703)
ALN-3000   5'-GAA CUG UGU GUG AGA GGU CCU-3' (SEQ ID NO: 6704)
           3'-Cs¹Gs¹ CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6705)
ALN-3001   5'-GAA CUG UGU GUG AGA GGU CCU-chol.²-3' (SEQ ID NO: 6706)
           3'-Cs¹Gs¹ CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6707)

¹ A 2'-O-Me group is attached to the nucleotide and the nucleotides have phosphorothioate
linkages (indicated by "s").

² Cholesterol is conjugated to the antisense strand via the linker: U-pyrroline
carrier-C(O)-(CH2)5-NHC(O)-cholesterol (via cholesterol C-3 hydroxyl).
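The base pairing implied by the two strands of each duplex in Table 17 can be checked mechanically. The following is a minimal sketch, not part of the disclosed methods: it assumes plain Watson-Crick pairing (A-U, G-C), ignores chemical modifications and conjugates, and uses the ALN-3000 base sequences with the antisense strand written 3' to 5' as in the table. The helper names are hypothetical.

```python
# Watson-Crick complement table for RNA (assumption: no wobble pairs considered).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def parse_strand(text):
    """Keep only the base letters A, U, G, C from a spaced or annotated strand."""
    return "".join(ch for ch in text.upper() if ch in COMPLEMENT)


def complement(seq):
    """Return the base-by-base complement (no reversal)."""
    return "".join(COMPLEMENT[base] for base in seq)


def duplex_offset(sense_5to3, antisense_3to5):
    """Return the index in the antisense strand (written 3'->5') at which the
    whole sense strand pairs perfectly, or None if it does not pair."""
    index = antisense_3to5.find(complement(sense_5to3))
    return index if index >= 0 else None


# Base sequences as printed for ALN-3000 (modifications omitted).
sense = parse_strand("GAA CUG UGU GUG AGA GGU CCU")
antisense = parse_strand("CG CUU GAC ACA CAC UCU CCA GGA")

offset = duplex_offset(sense, antisense)
print(len(sense), len(antisense), offset)
```

Under these assumptions the offset corresponds to the length of the unpaired 3' overhang on the antisense strand.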

Animals were injected (tail vein) with a volume of 200-250 µl test solution containing
buffer or an siRNA solution. Group 1 received buffer and group 2 received cholesterol
conjugated siRNA (ALN-3001) at a dose of 50 mg/kg body weight. Twenty-two hours after
injection, animals were sacrificed and livers collected. Organs were snap frozen on dry ice,
then pulverized in a mortar and pestle.
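The dosing arithmetic behind a weight-based injection can be sketched as follows. This is an illustration only: the 25 g body weight is an assumed value that does not appear in the text, and the function names are hypothetical. The dose (50 mg/kg) and the upper injection volume (250 µl) are taken from the paragraph above.

```python
def dose_mg(body_weight_g, dose_mg_per_kg):
    """Absolute dose in mg for an animal of the given weight."""
    return dose_mg_per_kg * body_weight_g / 1000.0


def required_concentration_mg_per_ml(total_dose_mg, volume_ul):
    """Concentration needed to deliver the dose in the given injection volume."""
    return total_dose_mg / (volume_ul / 1000.0)


ASSUMED_WEIGHT_G = 25.0  # hypothetical mouse weight, not stated in the source

total = dose_mg(ASSUMED_WEIGHT_G, 50.0)
concentration = required_concentration_mg_per_ml(total, 250.0)
print(f"dose: {total} mg, concentration: {concentration} mg/ml")
```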

For Luciferase mRNA analysis (by the QuantiGene Assay (Genospectra, Inc.;
Fremont, CA)), approximately 10 mg of tissue powder was resuspended in tissue lysis buffer,
and processed according to the manufacturer's protocol. Samples of the lysate were
hybridized with probes specific for Luciferase or GAPDH (designed using ProbeDesigner
software (Genospectra, Inc., Fremont, CA)) in triplicate, and processed for luminometric
analysis. Values for Luciferase were normalized to GAPDH. Mean values were plotted with
error bars corresponding to the standard deviation of the Luciferase measurements.

Results indicated that the level of luciferase RNA in animals injected with cholesterol
conjugated siRNA was reduced by about 70% as compared to animals injected with buffer
(see FIGs. 6A and 6B).
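The normalization and comparison described above can be sketched in a few lines. The readings below are invented for illustration and are not data from the study. The sketch also takes the standard deviation of the GAPDH-normalized ratios, which is an assumption about how the error bars were derived. Function and variable names are hypothetical.

```python
from statistics import mean, stdev


def normalize(target_signals, reference_signals):
    """Divide each target reading by its matching reference (GAPDH) reading."""
    return [t / r for t, r in zip(target_signals, reference_signals)]


def percent_reduction(control_ratios, treated_ratios):
    """Percent decrease of the treated group mean relative to the control mean."""
    return 100.0 * (1.0 - mean(treated_ratios) / mean(control_ratios))


# Hypothetical luminometer readings (arbitrary units), one value per animal.
buffer_luc = [1000.0, 1100.0, 900.0]
buffer_gapdh = [500.0, 500.0, 500.0]
sirna_luc = [300.0, 330.0, 270.0]
sirna_gapdh = [500.0, 500.0, 500.0]

buffer_ratios = normalize(buffer_luc, buffer_gapdh)
sirna_ratios = normalize(sirna_luc, sirna_gapdh)

reduction = percent_reduction(buffer_ratios, sirna_ratios)
buffer_sd = stdev(buffer_ratios)  # sample standard deviation for error bars
sirna_sd = stdev(sirna_ratios)
print(f"reduction: {reduction:.1f}%")
```

The example values were chosen so that the treated group sits at roughly 30% of the control mean, mirroring the size of the reported effect.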

In Vitro Activity 

HeLa cells expressing luciferase were transfected with each of the siRNAs listed in
Table 17. ALN-1000 siRNAs were most effective at decreasing luciferase mRNA levels
(~0.6 nM siRNA decreased mRNA levels to about 65% of the original expression level, and
1.0 nM siRNA decreased levels to about 20% of the original expression level); ALN-3001
siRNAs were least effective (~0.6 nM siRNA had a negligible effect on mRNA levels, and
1.0 nM siRNA decreased levels to about 40% of the original expression level).

Pharmacokinetics/Biodistribution 

Pharmacokinetic analyses were performed in mice and rats. Test siRNA molecules
were radioactively labeled with ³²P on the antisense strand by splint ligation. Labeled
siRNAs (50 mg/kg) were administered by tail vein injection, and plasma levels of siRNA
were measured periodically over 24 hrs by scintillation counting. Cholesterol conjugated
siRNA (ALN-3001) was discovered to circulate in mouse plasma for a longer period of time
than unconjugated siRNA (ALN-3000) (FIG. 7). RNAse protection assays indicated that
cholesterol-conjugated siRNA (ALN-3001) was detectable in mouse plasma 12 hours after
injection, whereas unconjugated siRNA (ALN-3000) was not detectable in mouse plasma
within two hours following injection. Similar results were observed in rats.

Mouse liver was harvested at varying time points (ranging from 0.08-24 hours)
following injection with siRNA, and siRNA localized to the liver was quantified. Over the
time period tested, the amount of cholesterol-conjugated siRNA (ALN-3001) detected in the
liver ranged from 14.3-3.55 percent of the total dose administered to the mouse. The amount
of unconjugated siRNA (ALN-3000) detected in the liver was lower, ranging from 3.91-1.75
percent of the total dose administered.
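To put the percent-of-dose figures on an absolute scale, they can be converted to a mass of siRNA per liver. The sketch below is illustrative only: it assumes a 25 g mouse, a weight not given in the text, and uses the 50 mg/kg dose stated above. The function name is hypothetical.

```python
def liver_amount_ug(percent_of_dose, dose_mg_per_kg, body_weight_g):
    """Mass of siRNA in the liver (µg) for a given percent of the injected dose.

    mg/kg multiplied by grams of body weight gives the total dose in µg.
    """
    total_dose_ug = dose_mg_per_kg * body_weight_g
    return total_dose_ug * percent_of_dose / 100.0


ASSUMED_WEIGHT_G = 25.0  # hypothetical mouse weight, not stated in the source

# Ranges reported in the text, as percent of the total dose.
conjugated = [liver_amount_ug(p, 50.0, ASSUMED_WEIGHT_G) for p in (14.3, 3.55)]
unconjugated = [liver_amount_ug(p, 50.0, ASSUMED_WEIGHT_G) for p in (3.91, 1.75)]
print(conjugated, unconjugated)
```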
Detection of siRNA in Different Tissues

Various tissues and organs (fat, heart, kidney, liver, and spleen) were harvested from
two CMV-Luc mice 22 hours following injection with 50 mg/kg ALN-3001. The antisense
strand of the siRNA was detected by RNAse protection assay. The liver contained the
greatest concentration of siRNA (~8-10 µg siRNA/g tissue); the spleen, heart and kidney
contained lesser amounts of siRNA (~2-7 µg siRNA/g tissue); and fat tissue contained the
least amount of siRNA (<~1 µg siRNA/g tissue).

Glucose-6-phosphatase siRNA detection by RNAse Protection Assay
BALB/c mice were injected with U/U, 3'C/U, or 3'C/3'C siRNA (4 mg/kg) targeting
glucose-6-phosphatase (G6Pase) (see Table 18). Administration was by hydrodynamic tail
vein injection (hd) or non-hydrodynamic tail vein injection (iv), and siRNA was
subsequently detected in the liver by RNAse protection assay.



Table 18. Test iRNA agents targeting glucose-6-phosphatase

siRNA     Description
U/U       No cholesterol; dinucleotide 3' overhangs on sense and antisense strands
3'C/U     dinucleotide 3' overhangs on sense and antisense strands; cholesterol
          conjugated to 3' end of sense strand (mono-conjugate)
3'C/3'C   dinucleotide 3' overhangs on sense and antisense strands; cholesterol
          conjugated to 3' end of both sense and antisense strands (bis-conjugate)



Unconjugated siRNA (U/U) delivered by hd was detected by 15 min. post-injection
(the earliest determined time-point) and was still detectable in the liver 18 hours
post-injection.

Delivery by normal iv administration resulted in the greatest concentration of 3'C/3'C
siRNA (the bis-cholesterol-conjugate) in the liver 1 hour post injection (as compared to the
mono-cholesterol-conjugate 3'C/3'U siRNA). At 18 hours post injection, 3'C/3'C siRNAs
and 3'C/U siRNA were still detectable in the liver with the bis-conjugate at higher levels
compared to the mono-conjugate.

While this invention has been particularly shown and described with reference to
preferred embodiments thereof, it will be understood by those skilled in the art that various
changes in form and details may be made therein without departing from the scope of the
invention encompassed by the appended claims.

WHAT IS CLAIMED IS:

1. An iRNA agent comprising a sense sequence and an antisense sequence, wherein
the sense sequence has one or more asymmetrical 2'-O alkyl modifications and the antisense
sequence has one or more asymmetrical phosphorothioate modifications, and the antisense
sequence targets a human gene sequence.

2. The iRNA agent of claim 1, wherein at least one of said 2'-O-alkyl modifications
is a 2'-OMe modification.

3. The iRNA agent of claim 1, wherein the sense sequence has at least 2
asymmetrical 2'-O alkyl modifications.

4. The iRNA agent of claim 1, wherein the sense has at least 4 asymmetrical 2'-O
alkyl modifications.

5. The iRNA agent of claim 4, wherein the asymmetrical modifications are 2'-OMe
modifications.

6. The iRNA agent of claim 1, wherein the sense sequence has at least 6
asymmetrical 2'-O alkyl modifications.

7. The iRNA agent of claim 6, wherein the asymmetrical modifications are 2'-OMe
modifications.

8. The iRNA agent of claim 1, wherein the sense sequence has at least 8
asymmetrical 2'-O alkyl modifications.

9. The iRNA agent of claim 8, wherein the asymmetrical modifications are 2'-OMe
modifications.

10. The iRNA agent of claim 1, wherein all of the subunits of the sense sequence
have an asymmetrical 2'-O alkyl modification.

11. The iRNA agent of claim 10, wherein the asymmetrical modifications are 2'-OMe
modifications.

12. The iRNA agent of claim 1, wherein the antisense sequence has at least 2
asymmetrical phosphorothioate modifications.

13. The iRNA agent of claim 1, wherein the antisense sequence has at least 4
asymmetrical phosphorothioate modifications.

14. The iRNA agent of claim 1, wherein the antisense sequence has at least 6
asymmetrical phosphorothioate modifications.

15. The iRNA agent of claim 1, wherein the antisense sequence has at least 8
asymmetrical phosphorothioate modifications.

16. The iRNA agent of claim 1, wherein all of the subunits of the sense sequence
have an asymmetrical phosphorothioate modification.

17. The iRNA agent of claim 1, wherein the sense and antisense sequences are on
different RNA strands.

18. The iRNA agent of claim 1, wherein the sense and antisense sequences are on the
same RNA strand.

19. The iRNA agent of claim 1, wherein the sense and antisense sequences are fully
complementary to each other.

20. The iRNA agent of claim 1, further comprising a cholesterol moiety.

21. The iRNA agent of claim 20, wherein said cholesterol moiety is coupled to a
sense strand.

22. The iRNA agent of claim 20, further comprising a second cholesterol moiety.

23. The iRNA agent of claim 22, wherein said second cholesterol moiety is coupled
to a sense strand.

24. The iRNA agent of claim 1, wherein said human gene is an oncogene.

25. The iRNA agent of claim 1, wherein said human gene is the apoB-100 gene.

26. The iRNA agent of claim 1, wherein said human gene is the glucose-6-phosphatase
gene.

27. The iRNA agent of claim 1, wherein the said human gene is the beta catenin
gene.

28. The iRNA agent of claim 1, wherein the iRNA agent is at least 21 nucleotides in
length, and the duplex region of the iRNA is about 19 nucleotides in length.

29. The iRNA agent of claim 1, having a duplex region of about 19 subunits in
length and one or two 3' overhangs of about 2 subunits in length.

30. A pharmaceutical preparation comprising the iRNA agent of claim 1.

31. A method for reducing apoB-100 levels in a subject comprising administering to
a subject an iRNA agent comprising a sense strand sequence and an antisense sequence,
wherein the sense sequence has at least 4 asymmetrical 2'-O alkyl modifications and the
antisense sequence has at least 4 asymmetrical phosphorothioate modifications, and the
antisense sequence targets apoB-100.

32. The method of claim 31, wherein the subject is suffering from a disorder
characterized by elevated or otherwise unwanted expression of apoB-100, elevated or
otherwise unwanted levels of cholesterol, and/or disregulation of lipid metabolism.

33. The method of claim 32, wherein said disorder is chosen from the group of
HDL/LDL cholesterol imbalance; dyslipidemias; hypercholesterolemia; statin-resistant
hypercholesterolemia; coronary artery disease (CAD); coronary heart disease (CHD);
atherosclerosis.

34. A method for reducing glucose-6-phosphatase levels in a subject comprising
administering to a subject an iRNA agent comprising a sense strand sequence and an
antisense sequence, wherein the sense sequence has at least 4 asymmetrical 2'-O alkyl
modifications and the antisense sequence has at least 4 asymmetrical phosphorothioate
modifications, and the antisense sequence targets glucose-6-phosphatase.

35. The method of claim 34, wherein the iRNA agent is administered to a subject to 
inhibit hepatic glucose production, or for the treatment of a glucose-metabolism-related 
disorder. 

36. The method of claim 35, wherein said disorder is diabetes. 

37. The method of claim 35, wherein said disorder is type-2 diabetes. 

38. A method of making an iRNA agent, the method comprising:
providing a sense strand sequence having at least 4 asymmetrical 2'-O alkyl
modifications and an antisense sequence having at least 4 asymmetrical phosphorothioate
modifications, and allowing the sense and antisense strand to hybridize.

39. A method of stabilizing an iRNA agent, comprising selecting a sequence with
activity, and introducing one or more asymmetrical modifications in said sequence, wherein
said modification decreases nuclease sensitivity while not decreasing activity.